CANADIAN
JOURNAL OF MATHEMATICS


Journal Canadien de Mathématiques 


VOL. XII - NO. 3
1960


A class of function algebras F.W. Anderson 353 
The equivalence of two extremum problems J. Kiefer and J. Wolfowitz 363
On relatively invariant measures Mark Mahowald 367 
Sums of functions of digits B. M. Stewart 374 
A linear diophantine problem S. M. Johnson 390 
Arithmetical inversion formulas Eckford Cohen 399 
The number of k-coloured graphs on labelled nodes R. C. Read 
Discrete groups of motions Leon Greenberg 


Limits of lattices in a compactly generated group
A. M. Macbeath and S. Swierczkowski


Canonical forms for certain matrices under unitary congruence 
J. W. Stander and N. A. Wiegmann 


On nilpotent products of cyclic groups Ruth Rebekka Struik 
Traces of matrices of zeros and ones H. J. Ryser 


A new type of characteristic subgroup of prime-power groups 
H. R. Brahana 


A theorem on pure submodules George Kolettis, Jr. 
Nodal non-commutative Jordan algebras Louis A. Kokoris 
Generalized Lie elements Rimhak Ree 
Modifications and cobounding manifolds Andrew H. Wallace 


Published for 
THE CANADIAN MATHEMATICAL CONGRESS 


by the 


University of Toronto Press 





EDITORIAL BOARD 


H. S. M. Coxeter, G. F. D. Duff, R. D. James, R. L. Jeffery, 
J.-M. Maranda, G. de B. Robinson, P. Scherk


with the co-operation of 


D. B. DeLury, J. Dixmier, W. Fenchel, H. Freudenthal, I. Kaplansky,
N. S. Mendelsohn, C. A. Rogers, H. Schwerdtfeger, A. W. Tucker,
W. J. Webber, M. Wyman


The chief languages of the Journal are English and French. 


Manuscripts for publication in the Journal should be sent to the 
Editor-in-Chief, G. F. D. Duff, University of Toronto. Authors are 
asked to write with a sense of perspective and as clearly as possible,
especially in the introduction. Regarding typographical conventions, 
attention is drawn to the Author’s Manual of which a copy will be 
furnished on request. 


All other correspondence should be addressed to the Managing 
Editor, G. de B. Robinson, University of Toronto. 


The Journal is published quarterly. Subscriptions should be sent 
to the Managing Editor. The price per volume of four numbers 
is $10.00. This is reduced to $5.00 for individual members of recognized 
Mathematical Societies. 


The Canadian Mathematical Congress gratefully acknowledges the 
assistance of the following towards the cost of publishing this Journal: 


University of Alberta Assumption University 
University of British Columbia Carleton College 
Dalhousie University Ecole Polytechnique 
Université Laval Loyola College 
University of Manitoba McGill University 
McMaster University Université de Montréal 
Mount Allison University Nova Scotia Technical College 
Queen’s University St. Mary’s University 
University of Saskatchewan University of Toronto 

National Research Council of Canada 

and the 
American Mathematical Society 


AUTHORIZED AS SECOND CLASS MAIL, POST OFFICE DEPARTMENT, OTTAWA 








A CLASS OF FUNCTION ALGEBRAS 
F. W. ANDERSON 


Introduction. A problem which has generated considerable interest during 
the past couple of decades is that of characterizing abstractly systems of real- 
valued continuous functions with various algebraic or topological-algebraic 
structures. With few exceptions known characterizations are of systems of 
bounded continuous functions on compact or locally compact spaces. Only 
recently have characterizations been given of the systems C(X) of all real- 
valued continuous functions on an arbitrary completely regular space X (1). 
One of the main objects of this paper is to provide, by using certain special 
techniques, a characterization of C(X) for a particular class of (not necessarily 
compact) completely regular spaces. 

Generally speaking, one of the primary difficulties in characterizing all of
C(X) is that of obtaining conditions which insure that a subsystem is, in fact,
all of C(X). Sets of conditions of two different types have evolved. The first,
for X compact, uses the completeness of C(X) in its usual norm and the
Stone-Weierstrass Theorem. (For example, see (10) and (13).) The second
uses the fact that C(X) is, in a sense, maximal in a certain class of algebraic
systems (cf. (1, 6)). The first of these appears to be applicable only in
situations where C(X) possesses a norm or a suitable family of pseudo-norms.
The second, although it applies in more general situations and is algebraic in
nature, has the slight drawback of the “external” character of the maximality
condition.


In this paper we characterize C(X) as a vector lattice, as an f-ring, and as
an algebra¹ for the case in which X is a P-space (7). A feature of special
interest in these characterizations is that we appeal to neither of the
aforementioned methods for obtaining all of C(X); rather we use, for X a P-space,
a simple property of certain “fixed” subsets of C(X). En route to obtaining
these results we also characterize M(X, B), the set of all real-valued measurable
functions on a total measurable space, as a vector lattice and as an f-ring.

In two recent papers, Brainerd ((4) and (5)) has also given characterizations
of C(X), X a P-space, and M(X, B) as f-algebras. The characterizations of
C(X) by Brainerd as an f-algebra and by us as an f-ring, although obtained
independently, use essentially the same techniques.


Received April 5, 1958. Presented to the American Mathematical Society June 20, 1958.

¹For the theory of vector lattices and f-rings see Birkhoff (2), Birkhoff and Pierce (3), and
Nakano (11). Our notation will be that of (2) except that ∨ and ∧ will be used to denote
lattice join and meet, respectively. By an algebra we shall always mean an algebra over the
real field.




1. Preliminaries. If X is a set, denote by F(X) the set of all real-valued
functions on X. If B is a Boolean σ-algebra of subsets of X, then we say
that the pair (X, B) is a total measurable space and denote by M(X, B) the
set of all f ∈ F(X) measurable B. If T is a base for a topology on X, then we
denote by C(X, T), or in unambiguous cases simply C(X), the set of all
f ∈ F(X) continuous with respect to T.

For each f ∈ F(X) set Z(f) = {x ∈ X; f(x) = 0}. A subset Z ⊂ X is a
measurable zero set in case Z ∈ B, or equivalently, in case Z = Z(f) for some
f ∈ M(X, B). A subset Z ⊂ X is a continuous zero set in case Z = Z(f) for
some f ∈ C(X).

A subset I ⊂ F(X) is fixed in case ∩{Z(f); f ∈ I}, also written ∩Z(I),
is non-empty. Let A ⊂ F(X). A set I ⊂ A is a maximal fixed subset of A if
and only if I = {f ∈ A; f(x) = 0} for some x ∈ X. In general, different points
in X do not give rise to different maximal fixed subsets of A; if, however, A
separates points (that is, x ≠ y in X implies 0 = f(x) ≠ f(y) for some f ∈ A),
then the mapping I → ∩Z(I) is one-one from the maximal fixed subsets of
A onto X.

Let A ⊂ F(X). Then for each I ⊂ A, set

$$a(I) = \{f \in A;\ X - \textstyle\bigcap Z(I) \subset Z(f)\}.$$

Thus f ∈ a(I) if and only if for every x ∈ X and every g ∈ I, f(x)g(x) = 0.
We say that I ⊂ A is Z-convex (in A) provided that

$$I = \{f \in A;\ \textstyle\bigcap Z(I) \subset Z(f)\}.$$

It is clear then that for each I ⊂ A, if I = a(a(I)), then I is Z-convex. The
converse in general is false; for example, every maximal fixed subset I of A
is Z-convex, but it need not satisfy I = a(a(I)). If 𝒩 is the collection of maximal
fixed subsets of A, then it is clear that I ⊂ A is Z-convex if and only if
I = ∩{N ∈ 𝒩; I ⊂ N}.
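The definitions of Z(f), a(I), and Z-convexity can be tried out on a small finite example. The following is only an illustrative sketch: the three-point set X, the choice of A as all {0,1}-valued functions on X, and every function name are assumptions made for the illustration and do not come from the paper.

```python
from itertools import product

X = range(3)                                  # a three-point set (hypothetical)
A = list(product((0, 1), repeat=len(X)))      # all {0,1}-valued functions on X, as tuples

def zero_set(f):
    """Z(f) = {x in X : f(x) = 0}."""
    return frozenset(x for x in X if f[x] == 0)

def common_zeros(I):
    """The intersection of the sets Z(f), f in I (all of X when I is empty)."""
    out = frozenset(X)
    for f in I:
        out &= zero_set(f)
    return out

def annihilator(I):
    """a(I) = {f in A : X minus the common zero set of I is contained in Z(f)}."""
    off = frozenset(X) - common_zeros(I)
    return [f for f in A if off <= zero_set(f)]

def is_z_convex(I):
    """I is Z-convex in A iff I = {f in A : common zero set of I is contained in Z(f)}."""
    core = common_zeros(I)
    return set(I) == {f for f in A if core <= zero_set(f)}

def kills_pointwise(f, I):
    """True iff f(x) g(x) = 0 for every x in X and every g in I."""
    return all(f[x] * g[x] == 0 for g in I for x in X)

N0 = [f for f in A if f[0] == 0]              # maximal fixed subset at the point 0
print(is_z_convex(N0), sorted(annihilator(N0)))
```

In this finite setting every maximal fixed subset happens to satisfy I = a(a(I)); the failure of the converse mentioned in the text needs a less trivial A.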

A topological space X (= (X, T)) is a P-space (7) provided that X is
completely regular and that every G_δ-set in X is open. In such a space X the
family of continuous zero sets of X is an open base for the topology, is a
Boolean σ-algebra of subsets of X, and coincides with the family of closed-open
subsets of X. Conversely, if (X, B) is a total measurable space which separates
points of X (that is, x ≠ y in X implies x ∈ E and y ∉ E for some E ∈ B),
then B is an open base for a topology on X relative to which X is a P-space.
Moreover, it is clear that in this case M(X, B) ⊂ C(X, B). We now prove
a test for equality.²

LEMMA 1.1. Let X be a P-space, let B, a Boolean σ-algebra of subsets of X, be
an open base for the topology of X, and let 𝒩 be the set of maximal fixed subsets
of M(X, B). Then M(X, B) = C(X, B) if and only if for every Z-convex set
I ⊂ M(X, B), if N ∈ 𝒩 implies I ⊄ N or a(I) ⊄ N, then I = a(f) for some
f ∈ M(X, B).

²See also (5, Theorem 1) for a variation of this result.



















Proof. Let I ⊂ M(X, B) be Z-convex. We shall prove first that the two
conditions

(1) I ⊄ N or a(I) ⊄ N for all N ∈ 𝒩;

(2) ∩Z(I) is a continuous zero set;

are equivalent. Assume (1). Then F_1 = ∩Z(I) and F_2 = ∩Z(a(I)) are
disjoint. For let x ∈ X and let

$$N_x = \{f \in M(X, B);\ f(x) = 0\}.$$

Since N_x is Z-convex, we have x ∈ F_1 if and only if I ⊂ N_x, and x ∈ F_2 if
and only if a(I) ⊂ N_x. Thus, by (1), F_1 ∩ F_2 = ∅. Since M(X, B) ⊂ C(X, B),
we conclude that F_1 and F_2 are closed. Since B is an open base, if x ∈ X, then
{x} = ∩Z(N_x). Therefore if x ∉ F_1 and if f ∈ M(X, B), then {x} = ∩Z(N_x)
⊂ X − F_1 ⊂ Z(f) implies f ∈ N_x. That is, a(I) ⊂ N_x, so that x ∈ F_2. Hence
X = F_1 ∪ F_2. We have then that F_1 is both closed and open, and therefore
F_1 = ∩Z(I) is a continuous zero set. Conversely, assume (2). Then since X
is a P-space, F = ∩Z(I) is closed and open. Since B is an open base, if
x ∈ F, then there is an f ∈ M(X, B) such that f(x) ≠ 0 and X − F ⊂ Z(f);
that is, f ∈ a(I) and f ∉ N_x. Hence I ⊂ N_x implies a(I) ⊄ N_x. Thus (1) and
(2) are equivalent.

We now easily prove the “only if” portion of the lemma. For suppose that
M(X, B) = C(X, B) and that I ⊂ M(X, B) satisfies (1). Then F = ∩Z(I)
is closed and open so that the characteristic function f of F is in M(X, B). It
is evident then that I = a(f), since a function g vanishes on F exactly when
gf = 0.

Conversely, let g ∈ C(X, B) and let α be a real number. Set Z = {x ∈ X;
g(x) ≥ α}. Then Z = Z((α − g) ∨ 0) is a continuous zero set. Let

$$I = \{f \in M(X, B);\ Z \subset Z(f)\}.$$

Then I is Z-convex and ∩Z(I) = Z. Therefore I satisfies (2) and hence
(1). Thus, if M(X, B) satisfies the condition of the lemma, I = a(f) for some
f ∈ M(X, B). We claim that Z = X − Z(f). Certainly X − Z(f) ⊂ Z. Suppose
then that x ∈ Z(f). Since f ∈ M(X, B) ⊂ C(X, B), Z(f) is a continuous
zero set, and therefore, since X is a P-space, Z(f) is open. Now B is an open
base, so there is an h ∈ M(X, B) such that h(x) ≠ 0 and X − Z(f) ⊂ Z(h).
Then h ∈ I and Z = ∩Z(I) ⊂ Z(h). Hence x ∉ Z, and we have the desired
reverse inclusion X − Z(f) ⊃ Z. Now Z(f) is measurable since f ∈ M(X, B),
and therefore its complement Z is measurable. Consequently, since α was
arbitrary, we conclude that g is measurable B and hence that g ∈ M(X, B).
Thus M(X, B) = C(X, B) and the lemma is proved.


2. Vector lattices of functions. In this section we characterize M(X, B)
and C(X, T) abstractly as vector lattices where (X, B) is a total measurable
space and (X, T) is a P-space.

Let A be a vector lattice. For f, g ∈ A we write f ⊥ g in case |f| ∧ |g| = 0. A
countable set {f_n} of elements of A is a σ+-set in case f_n ≥ 0 (n = 1, 2, ...)
and for each n ≠ m, f_n ⊥ f_m. We say that A is σ+-complete in case every
σ+-set {f_n} in A has a least upper bound, ∨_n f_n, in A.³


LEMMA 2.1. Let A be a vector sublattice of F(X) which separates points of X
and contains the constant function 1. Then A = M(X, B) for some point separating
σ-algebra B of subsets of X if and only if A is σ+-complete and σ-complete.

Proof. The necessity of these conditions follows readily from the fact that
if A = M(X, B), then the desired countable suprema are simply the “pointwise”
suprema.

Conversely, let A satisfy the stated conditions. If {f_n} ⊂ A with f = ∨_n f_n ∈ A,
then we claim that f(x) = ∨_n [f_n(x)] for each x ∈ X. For suppose, on the
contrary, that there is an x ∈ X with f(x) > ∨_n [f_n(x)]. Without loss of
generality, we may assume that, for all n, 0 ≤ f_n ≤ f_{n+1} ≤ 1 and f_n(x) = 0,
and that f(x) = 1. Now define sequences {g_n}, {h_n}, and {e_n} in A by

$$g_1 = 2f_1 \wedge 1 \quad\text{and}\quad g_n = 2(f_n \vee g_{n-1}) \wedge 1 \quad\text{for } n > 1;$$
$$h_n = (2g_n - g_{n+1})^{+};$$

and

$$e_1 = h_1, \quad e_2 = 2h_2, \quad\text{and}\quad e_n = n(h_n - h_{n-2}) \quad\text{for } n > 2.$$





Also, for each n, set

$$Y_n = \{y \in X;\ g_n(y) = 1\}.$$

Then one easily shows that, for each n, 0 ≤ h_n ≤ h_{n+1} ≤ 1, h_n(Y_n) = 1, and
h_n(X − Y_{n+1}) = 0. From these it follows that 0 ≤ e_n ≤ n, e_n(x) = 0,
e_n(Y_n − Y_{n−1}) = n, and

$$X - Z(e_n) \subset Y_{n+1} - Y_{n-2},$$

where Y_{−1} = Y_0 = ∅. This implies that if |m − n| > 2, then e_m ⊥ e_n; hence
each of the sets {e_{3n}}, {e_{3n−1}}, and {e_{3n−2}} is a σ+-set in A. Therefore, since A
is σ+-complete,

$$e = \bigvee_{k=0}^{2}\Bigl(\bigvee_{n=1}^{\infty} e_{3n-k}\Bigr) = \bigvee_{n=1}^{\infty} e_n$$

is in A. Now if f_n(y) > 0, then 2^k f_n(y) ≥ 1 for some k; therefore, since

$$g_{n+k}(y) = [2^k f_n(y)] \wedge 1 = 1,$$

we have y ∈ Y_{n+k}. That is, if

$$P = \bigcup_{n=1}^{\infty} (X - Z(f_n)),$$

then

$$P \subset \bigcup_{n=1}^{\infty} Y_n = \bigcup_{n=1}^{\infty} (Y_n - Y_{n-1}).$$

³Other, possibly less descriptive, terminology for this notion includes σ-full (2) and complete
(11).







Thus we have that e ≥ 1 on P, and consequently, that f ≤ e on X. Hence
there is an integer k ≥ 2 such that e(x) ≤ k − 1. Set

$$e' = \Bigl(\bigvee_{i=1}^{k-1} k e_i\Bigr) \vee e.$$

Then e′(y) ≥ k for all y ∈ P and e′(x) ≤ e(x) ≤ k − 1. Therefore (e′ − k + 1)⁺ ≥ 1
on P and, as a result, f ≤ (e′ − k + 1)⁺. This is a contradiction since
(e′ − k + 1)⁺(x) = 0. We conclude then that f(x) = 0, and therefore countable
suprema in A, when defined, are defined pointwise.

For each f ∈ A, set e_f = ∨_n (|nf| ∧ 1); then, by the result of the preceding
paragraph, e_f is the characteristic function of X − Z(f). Thus A contains
e_f and 1 − e_f, the characteristic functions of X − Z(f) and Z(f), respectively.
Now let B = {Z(f); f ∈ A}. Then B is an algebra of subsets of X; for
Z(f) ∪ Z(g) = Z(|f| ∧ |g|) and X − Z(f) = Z(1 − e_f). Since A is point
separating, it is clear that B also is point separating. Moreover, B is a
σ-algebra; for, using the result of the first paragraph and the σ-completeness
of A, we have

$$\bigcap_n Z(f_n) = \bigcap_n Z(e_{f_n}) = Z\Bigl(\bigvee_n e_{f_n}\Bigr) \in B.$$

We show next that A ⊂ M(X, B). Let f ∈ A and let α be real. Then

$$\{x \in X;\ f(x) \ge \alpha\} = Z((\alpha - f)^{+}) \in B,$$

so that f ∈ M(X, B). On the other hand, A contains all measurable characteristic
functions, and so, since A is σ-complete, A contains all bounded
f ∈ M(X, B). (Cf. (8, Theorem 20.B).) To complete the proof we need only
show that A contains all non-negative f ∈ M(X, B). So let f ≥ 0 in M(X, B).
For each n = 1, 2, ..., set

$$E_n = \{x \in X;\ n - 1 \le f(x) < n\}$$

and let f_n ∈ F(X) be defined by f_n = f on E_n and f_n = 0 on X − E_n. Then
obviously f_n ∈ M(X, B) and is bounded; hence f_n ∈ A for all n. But {f_n} is
a σ+-set, so that f = ∨_n f_n ∈ A. Thus the proof of the lemma is complete.

It is interesting to note that neither σ+-completeness nor σ-completeness
alone is adequate to insure that A = M(X, B). For example, if X is uncountable,
then the set of all f ∈ F(X) with f(X) countable is a vector sublattice
of F(X) which is σ+-complete but not σ-complete. Next let X be the
Stone-Čech compactification of an infinite discrete space and let A = C(X). Then A
is a vector sublattice of F(X) which is σ-complete but not σ+-complete; in
fact, there exist bounded sequences {f_n} in A such that Z(∨_n f_n) ≠ ∩_n Z(f_n).

Let A be a vector lattice. An element e ∈ A is a weak order unit in case
for all f ∈ A, |f| ∧ |e| = 0 implies f = 0. A subset I ⊂ A is an ideal of A in
case I is a linear subspace such that f ∈ I and |g| ≤ |f| implies that g ∈ I.


THEOREM 2.2. A vector lattice A is isomorphic to the vector lattice M(X, B) for
some total measurable space (X, B) if and only if A is σ+-complete, σ-complete,
has a weak order unit, and ∩𝒩 = 0 where 𝒩 is the set of maximal ideals of A.
In fact, when A satisfies the stated conditions, A is isomorphic to M(𝒩, B),
where B is a point separating σ-algebra of subsets of 𝒩.


Proof. Since the family ℱ of fixed maximal ideals (= maximal fixed ideals)
of M(X, B) satisfies ∩ℱ = 0, the necessity of the conditions is obvious.

Conversely, let A satisfy the stated conditions. Let e ∈ A be a weak order
unit for A; we may assume that e ≥ 0. We claim that if 𝒩_e = {N ∈ 𝒩;
e ∉ N}, then ∩𝒩_e = 0. For if f ∈ ∩𝒩_e then, for every N ∈ 𝒩, either
f ∈ N or e ∈ N. Thus (|f| ∧ e) ∈ ∩𝒩 so that |f| ∧ e = 0. Since e is a weak
order unit, this implies f = 0. That is, ∩𝒩_e = 0. By a familiar technique
(1) we can define an isomorphism of A onto a point-separating vector
sublattice A* of F(𝒩_e) such that e is mapped onto the constant function 1.
Appealing to Lemma 2.1 we have that A* = M(𝒩_e, B) for some σ-algebra B of
subsets of 𝒩_e.

To complete the proof it will suffice to show that 𝒩_e = 𝒩 and for this
it will suffice to show that if (X, B) is a total measurable space, then no
maximal ideal of M(X, B) contains 1. Suppose, on the contrary, that N is
a maximal ideal of M(X, B) and that 1 ∈ N. Then since N is proper, there
is an f ≥ 0 with f ∉ N. Since N is maximal and since f² is in M(X, B),
there is a real number α such that f² − αf ∈ N. Let β = ¼(α + 1)². Since
1 ∈ N, it follows that β, and hence f² − αf + β, belongs to N. But

$$f^2 - \alpha f + \beta = \bigl[f - \tfrac12(\alpha + 1)\bigr]^2 + f \ge f,$$

contrary to f ∉ N. Thus the assumption 1 ∈ N is untenable and the proof is
complete.
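The completing-the-square step at the end of the proof above can be checked term by term. With β = ¼(α + 1)², as in the proof:

```latex
\begin{align*}
\bigl[f - \tfrac12(\alpha+1)\bigr]^2 + f
  &= f^2 - (\alpha+1)f + \tfrac14(\alpha+1)^2 + f \\
  &= f^2 - \alpha f + \beta .
\end{align*}
```

Since the square is non-negative, f² − αf + β ≥ f ≥ 0, so an ideal that contains f² − αf + β must also contain f.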


Let A be a vector lattice and let I ⊂ A. We set

$$I^{\perp} = \{f \in A;\ f \perp g \text{ for all } g \in I\}.$$

Then clearly, I ⊂ I^⊥⊥. If 𝒩 is a family of ideals of A, then an ideal I of A
is 𝒩-complemented in case I = I^⊥⊥, and for each N ∈ 𝒩 either I ⊄ N or
I^⊥ ⊄ N.

THEOREM 2.3. Let A be a vector lattice and let 𝒩 be the set of all maximal
ideals of A. Then A is isomorphic to the vector lattice C(X) for some completely
regular P-space X if and only if A is σ+-complete, σ-complete, ∩𝒩 = 0, and
for each 𝒩-complemented ideal I of A, I = {f}^⊥ for some f ∈ A.


Proof. To prove the necessity we may assume that X is a Q-space (9); for
if X is a P-space, then so is υX, and, of course, C(X) and C(υX) are
isomorphic. With this assumption the maximal ideals of C(X) coincide with the
maximal fixed subsets of C(X). Moreover, if I ⊂ C(X), then I^⊥ coincides
with the set a(I) defined in § 1. These observations combine with Lemma 1.1
and Theorem 2.2 to establish the necessity of the conditions in the present
theorem.










Conversely, let A satisfy the stated conditions. Since the zero ideal of A
is clearly 𝒩-complemented, it follows that A has a weak order unit. Therefore,
by Theorem 2.2, A is isomorphic to M(X, B) for some total measurable
space (X, B) where, in fact, the maximal ideals 𝒩 correspond to the maximal
fixed ideals of M(X, B). A Z-convex set I* of M(X, B) is then the image of
some I = ∩{N ∈ 𝒩; I ⊂ N} in A, and therefore is an ideal of M(X, B).
Since we clearly have a(I*) = (I*)^⊥, it follows from Lemma 1.1 that
M(X, B) = C(X, B), and the proof is complete.


3. f-rings of functions. In this section we characterize M(X, B) and
C(X, T), (X, B) and (X, T) as before, as f-rings. Although these characterizations
still require σ-completeness, we are able to dispense with the full force
of the σ+-completeness requirement. In its place we use ring regularity and a
condition of countable character on certain ideals. These characterizations
are slightly sharpened versions of those given in (4) and (5).

Recall that an f-ring (3) is a lattice-ordered ring A with the property that
for all f, g, h ∈ A, f ∧ g = 0 and h ≥ 0 together imply hf ∧ g = fh ∧ g = 0.
Clearly M(X, B) and C(X, T) are f-rings.

A ring A is regular (12) in case for each f ∈ A, there is an f′ ∈ A such
that ff′f = f. It is known (7) that a completely regular space X is a P-space
if and only if C(X), as a ring, is regular. Concerning regular f-rings we prove
the following result which may be of independent interest.
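Ring regularity is easy to see in the simplest case, the ring F(X) of all real functions on a finite set X (which is C(X) for X discrete, and a discrete space is a P-space). The sketch below is only an illustration under that assumption: the helper names and the sample function f are hypothetical, and f′ is chosen as the pointwise reciprocal off the zero set.

```python
from fractions import Fraction

def quasi_inverse(f):
    """Pointwise f'(x) = 1/f(x) where f(x) != 0, and f'(x) = 0 on Z(f)."""
    return tuple(Fraction(1) / v if v != 0 else Fraction(0) for v in f)

def mul(f, g):
    """Pointwise product of two functions given as tuples of values."""
    return tuple(a * b for a, b in zip(f, g))

# A function on a three-point set, listed by its values (illustrative data).
f = (Fraction(2), Fraction(0), Fraction(-3, 4))
f_prime = quasi_inverse(f)
e = mul(f, f_prime)          # idempotent: characteristic function of X - Z(f)
print(mul(mul(f, f_prime), f) == f, e)
```

The idempotent e = ff′ is the characteristic function of X − Z(f), which is the kind of element the proof of Lemma 3.1 builds from a weak order unit.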


LEMMA 3.1. Let A be a regular f-ring. Then
(1) For all f, g ∈ A, |f| ∧ |g| = 0 if and only if fg = 0.
(2) If A has a weak order unit, A has an identity.

Proof. Since A has no non-zero nilpotent elements, the l-radical of A is
zero (3). Therefore (1) follows from (3, Corollary 1, p. 57) and (3, Corollary
2, p. 63). Next let e ≥ 0 be a weak order unit for A and let ee′e = e. Then,
by (1), |f| ∧ |ee′| = 0 implies fee′e = fe = 0 which implies |f| ∧ e = 0 and thus
f = 0. That is, the idempotent e″ = ee′ is also a weak order unit. Let f ∈ A;
then (fe″ − f)e″ = 0 implies |fe″ − f| ∧ e″ = 0. Therefore, since e″ is a weak
order unit, fe″ = f. Similarly, e″f = f, which establishes (2).

An ideal I of a ring A is σ-closed in case for every countable set {f_n} ⊂ I
there is an f ∈ I with ff_n = f_n f = f_n for all n.


THEOREM 3.2. Let A be an f-ring and let 𝒩 be the set of σ-closed maximal
ring ideals of A. Then A is isomorphic to the f-ring M(X, B) for some total
measurable space (X, B) if and only if A is regular, σ-complete, has a weak order
unit, and ∩𝒩 = 0. Moreover, if A satisfies these conditions, the space (X, B)
and the isomorphism of A onto M(X, B) may be so chosen that the set 𝒩 is
mapped one-one onto the maximal fixed subsets of M(X, B).

Proof. The necessity of the conditions is easily proved; we omit the details.
Conversely, let A satisfy the stated conditions. Then, by Lemma 3.1, A has
a ring identity e. Moreover, since A is σ-complete, it is Archimedean (2, p.
229), and therefore A is commutative (3, Theorem 13). Since the regular
σ-complete subring of A generated by e is isomorphic to the ordered field R
of real numbers, we may regard A as a regular f-algebra over R (that is, A
is a regular F-ring in the sense of (4)). Now let N ∈ 𝒩 and {f_n} ⊂ N such
that ∨_n f_n ∈ A. Since N is σ-closed, there is an f ∈ N with ff_n = f_n for all n.
By the regularity of A we may assume that f is idempotent. Then (11,
Theorem 25.1), ∨_n f_n = ∨_n ff_n = f(∨_n f_n) ∈ N. Therefore A satisfies the
conditions required in Brainerd's characterization (4, p. 682). Thus there
exist a total measurable space (X, B) and an isomorphism of A onto M(X, B)
with the desired properties.

Let A be a ring. For I ⊂ A, set a(I) = {f ∈ A; fg = 0 for all g ∈ I}. In
general a(I) is a left ideal of A; if A is commutative or if A is a regular f-ring,
then a(I) is a two-sided ideal.⁴ A left ideal I of A is a-principal in case I = a(f)
for some f ∈ A.

If 𝒩 is a family of ideals of a ring A, then an ideal I of A is 𝒩-complemented
in case I = a(a(I)) and for each N ∈ 𝒩 either I ⊄ N or a(I) ⊄ N.

THEOREM 3.3. Let A be an f-ring and let 𝒩 be the set of σ-closed maximal
ring ideals of A. Then A is isomorphic to the f-ring C(X) for some P-space X
if and only if A is regular, σ-complete, ∩𝒩 = 0, and every 𝒩-complemented
ideal of A is a-principal.


Proof. To prove the necessity, we may assume that X is a Q-space. Then
an application of Theorem 3.2 and Lemma 1.1 completes this portion of the
proof.

Conversely, since the zero ideal of A is 𝒩-complemented, it is a-principal.
But from {0} = a(f) and Lemma 3.1 we conclude that |f| is a weak order
unit for A. Therefore, by Theorem 3.2 and Lemma 1.1, we have that A is
isomorphic to M(X, B) and that M(X, B) = C(X, B) where (X, B) is a
P-space.


4. The algebra C(X). With no assumptions concerning order properties
it seems to be difficult to obtain a reasonably simple characterization of the
algebras M(X, B). It is possible, however, to characterize the algebra C(X),
X a P-space, and it is the object of this section to present such a
characterization.


Let A be a ring and let 𝒩 be a family of ideals of A. A set {f_α} ⊂ A is a
discrete 𝒩-cover in case α ≠ β implies f_α f_β = 0 and the set {f_α} is contained
in no member of 𝒩. We say that A is 𝒩-regular in case for each discrete
𝒩-cover {f_α} in A there is an f ∈ A such that f_α f f_α = f_α for all α.

The condition of 𝒩-regularity provides the means by which we avoid order
assumptions in the characterizations of C(X). In general, however, it is not
suitable for a characterization of M(X, B). For example, let X be uncountable
and let B be the algebra of countable sets and their complements. If 𝒩 is
the set of maximal fixed ideals of the algebra M(X, B), then M(X, B) is not
𝒩-regular. In fact, there is no algebra M(X, B) ⊂ A ⊂ F(X) other than
F(X) itself which is regular relative to its set 𝒩 of maximal fixed ideals.

⁴If A is a subring of F(X), then a(I) as defined here coincides with a(I) as defined in §1.


THEOREM 4.1. Let A be an algebra and let 𝒩 be the family of σ-closed real
ideals⁵ of A. Then A is isomorphic to the algebra C(X) for some P-space X if and
only if A is 𝒩-regular, ∩𝒩 = 0, and each 𝒩-complemented ideal of A is
a-principal.

Proof. Let X be a P-space. Again we may assume that X is also a Q-space;
hence every real ideal of C(X) is fixed. As before one easily proves that each
such ideal is σ-closed. Thus, clearly, ∩𝒩 = 0. If {f_α} ⊂ C(X) is a discrete
𝒩-cover, then the family {X − Z(f_α)} is a disjoint open cover of X; hence
f = Σ_α f_α is in C(X). Since C(X) is regular, there is an f′ ∈ C(X) with f′f² = f.
Now an obvious pointwise argument shows that f′f_α² = f_α for each α; therefore
C(X) is 𝒩-regular. That C(X) satisfies the final condition follows from
Lemma 1.1.

Conversely, let A satisfy the stated conditions. Then, as a subdirect sum
of fields, A is commutative. Since the zero ideal of A is 𝒩-complemented,
there is an f ∈ A such that {0} = a(f). If f ∈ N for some N ∈ 𝒩 then, since
N is σ-closed, there is a g ∈ N with fg = f. Let h ∈ A; then f(h − gh) = 0. But
{0} = a(f), so we have h = gh ∈ N; that is, A = N. This contradiction shows
that {f} is a discrete 𝒩-cover. Then since A is 𝒩-regular, f′f² = f for some
f′ ∈ A. Thus {0} = a(e) for some idempotent e (= f′f) in A; in fact, e is easily
seen to be an identity for A. Therefore (cf. (1)) we may assume that A is
(isomorphic to) a subalgebra of C(X) for some completely regular space X
and that (i) the maximal fixed subsets of A are the members of 𝒩, and (ii) for
each x ∈ X and each neighbourhood U of x, there is an f ∈ A such that
f(x) = 0 and f(y) ≥ 1 for all y ∉ U. It therefore remains to prove that X is
a P-space and that A = C(X). So let U = ∩_n U_n be a G_δ-set in X, let x ∈ U,
and let

$$N_x = \{f \in A;\ f(x) = 0\} \in \mathcal{N}.$$

By (ii), there is, for each n, an f_n ∈ N_x such that f_n(y) ≥ 1 for all y ∉ U_n.
Since N_x is σ-closed, there is an f ∈ N_x such that ff_n = f_n for all n. It is clear
that f(x) = 0 and that f(y) = 1 for all y ∉ U. Consequently U is a
neighbourhood of x. This establishes that X is a P-space.

Now let Z ⊂ X be a continuous zero set; that is, Z is closed and open
in X. For each x ∈ X − Z, there is an f ∈ N_x such that f(Z) = 1. Therefore
g = 1 − f ∈ A and g(x) = 1 and g(Z) = 0. We have from this that I = {f ∈ A;
Z ⊂ Z(f)} is Z-convex and ∩Z(I) = Z. Then with essentially the same
argument as that used in the proof of Lemma 1.1, we conclude that I is
𝒩-complemented. Therefore I = a(f) for some f ∈ A; thus, using the fact that X
is a P-space, it follows that Z = Z(f) for some f ∈ A. Since X − Z is also a
continuous zero set, X − Z = Z(g) for some g ∈ A. This clearly implies that
{f, g} is a discrete 𝒩-cover; hence f′g² = g for some f′ ∈ A. Thus (f′g)(Z) = 1
and (f′g)(X − Z) = 0. We have proved then that A contains the characteristic
function of each continuous zero set of X.

⁵An ideal N of A is real if A/N is isomorphic to the real field.

To complete the proof it will suffice to prove that A contains every strictly
positive function in C(X), for if f ∈ C(X), then f = [(f ∨ 0) + 1] − [−(f ∧ 0)
+ 1]. So let f ∈ C(X) be strictly positive and for each positive real number
α, set Z_α = {x ∈ X; f(x) = α}. Then each Z_α is a continuous zero set; let
e_α ∈ A be the characteristic function of Z_α. Since {Z_α} is a disjoint cover of
X, it follows that {αe_α} is a discrete 𝒩-cover in A. Therefore there is an
f′ ∈ A such that (αe_α)²f′ = αe_α for each α. Then for each x ∈ Z_α,

$$f'(x) = \alpha^{-1} = [f(x)]^{-1}.$$

Since {Z_α} covers X, it follows that {f′} is a discrete 𝒩-cover in A. Thus
there is an f″ ∈ A with (f′)²f″ = f′. Clearly then f″ = (f′)⁻¹ = f and f ∈ A
as desired.


REFERENCES 


1. F. W. Anderson and R. L. Blair, Characterizations of the algebra of all real-valued continuous
functions on a completely regular space, Illinois J. Math., 3 (1959), 121-133.
2. G. Birkhoff, Lattice theory (rev. ed., New York, 1948).
3. G. Birkhoff and R. S. Pierce, Lattice-ordered rings, An. Acad. Brasil. Ci., 28 (1956), 41-69.
4. B. Brainerd, On a class of lattice-ordered rings, Proc. Amer. Math. Soc., 8 (1957), 673-683.
5. B. Brainerd, F-rings of continuous functions I, Can. J. Math., 11 (1959), 80-86.
6. K. Fan, Partially ordered additive groups of continuous functions, Ann. Math., 51 (1950),
409-427.
7. L. Gillman and M. Henriksen, Concerning rings of continuous functions, Trans. Amer.
Math. Soc., 77 (1954), 340-362.
8. P. R. Halmos, Measure theory (New York, 1950).
9. E. Hewitt, Rings of real-valued continuous functions. I, Trans. Amer. Math. Soc., 64 (1948),
45-99.
10. S. Kakutani, Concrete representation of abstract (M)-spaces, Ann. Math., 42 (1941),
994-1024.
11. H. Nakano, Modern spectral theory (Tokyo, 1950).
12. J. von Neumann, Regular rings, Proc. Nat. Acad. Sci. U.S.A., 22 (1936), 707-713.
13. M. H. Stone, A general theory of spectra. II, Proc. Nat. Acad. Sci. U.S.A., 27 (1941),
83-87.

University of Oregon 













THE EQUIVALENCE OF TWO EXTREMUM PROBLEMS 
J. KIEFER AND J. WOLFOWITZ


1. Introduction. Let f_1, ..., f_k be linearly independent real functions
on a space X, such that the range R of (f_1, ..., f_k) is a compact set in
k-dimensional Euclidean space. (This will happen, for example, if the f_i are
continuous and X is a compact topological space.) Let S be any Borel field
of subsets of X which includes X and all sets which consist of a finite number
of points, and let C = {ξ} be any class of probability measures on S which
includes all probability measures with finite support (that is, which assign
probability one to a set consisting of a finite number of points), and which
are such that

$$m_{ij}(\xi) = \int_X f_i(x) f_j(x)\,\xi(dx), \qquad i, j = 1, \ldots, k,$$

is defined. In all that follows we consider only probability measures ξ which
are in C. Write M(ξ) for the k × k matrix ||m_ij(ξ)||. When M(ξ) is
non-singular, write [M(ξ)]⁻¹ = ||m^{ij}(ξ)||. (We shall not always exhibit dependence
on ξ.) Letting f(x) denote the column vector with components f_i(x), and
letting primes denote transposes, we define

$$d(x; \xi) = f(x)'[M(\xi)]^{-1} f(x)$$

whenever M(ξ) is non-singular.
We consider two extremum problems. The first is to choose ξ so that

(1) ξ maximizes det M(ξ).

The second is to choose ξ so that

(2) ξ minimizes max_x d(x; ξ).

We also note that the integral with respect to ξ of d(x; ξ) is k; hence,
max_x d(x; ξ) ≥ k, and thus a sufficient condition for ξ to satisfy (2) is

(3) max_x d(x; ξ) = k.
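The remark that d(x; ξ) integrates to k follows from a short trace computation, sketched here with the same symbols as above (I_k, the k × k identity matrix, is the only notation added):

```latex
\begin{align*}
\int_X d(x;\xi)\,\xi(dx)
  &= \int_X \operatorname{tr}\bigl([M(\xi)]^{-1} f(x) f(x)'\bigr)\,\xi(dx) \\
  &= \operatorname{tr}\Bigl([M(\xi)]^{-1} \int_X f(x) f(x)'\,\xi(dx)\Bigr)
   = \operatorname{tr}\bigl([M(\xi)]^{-1} M(\xi)\bigr)
   = \operatorname{tr} I_k = k .
\end{align*}
```

Because ξ is a probability measure, the maximum of d(x; ξ) over x is at least this average, which gives max_x d(x; ξ) ≥ k.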


The result of this note is that (1), (2), and (3) are equivalent. This result, 
which seems to have interest per se, also strengthens and extends results of 
the authors (1) on the optimum design of regression experiments. A brief 
description of the connection with the design of such experiments is given 
below. The proof of the theorem is elementary and brief. 
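A small numerical illustration may help fix ideas. The sketch below is not taken from the paper: it assumes quadratic regression on X = [−1, 1], that is f(x) = (1, x, x²) and k = 3, and compares the design with mass 1/3 at each of −1, 0, 1 against one arbitrarily chosen four-point design. All names are hypothetical, exact rational arithmetic is used, and the maximum of d is taken over a finite grid only, so this is numerical evidence rather than a proof.

```python
from fractions import Fraction as Fr

K = 3  # number of regression functions in this illustrative example

def f_vec(x):
    """Regression functions f(x) = (1, x, x^2) (an assumed example)."""
    return [Fr(1), x, x * x]

def info_matrix(design):
    """M(xi) = sum over support points of xi({x}) f(x) f(x)'."""
    M = [[Fr(0)] * K for _ in range(K)]
    for x, w in design.items():
        fx = f_vec(x)
        for i in range(K):
            for j in range(K):
                M[i][j] += w * fx[i] * fx[j]
    return M

def inverse_and_det(M):
    """Gauss-Jordan elimination over the rationals; returns (inverse, det)."""
    n = len(M)
    aug = [row[:] + [Fr(int(i == j)) for j in range(n)] for i, row in enumerate(M)]
    det = Fr(1)
    for c in range(n):
        p = next(r for r in range(c, n) if aug[r][c] != 0)  # fails if M is singular
        if p != c:
            aug[c], aug[p] = aug[p], aug[c]
            det = -det
        piv = aug[c][c]
        det *= piv
        aug[c] = [v / piv for v in aug[c]]
        for r in range(n):
            if r != c and aug[r][c] != 0:
                factor = aug[r][c]
                aug[r] = [vr - factor * vc for vr, vc in zip(aug[r], aug[c])]
    return [row[n:] for row in aug], det

def d(x, Minv):
    """d(x; xi) = f(x)' M(xi)^{-1} f(x)."""
    fx = f_vec(x)
    return sum(fx[i] * Minv[i][j] * fx[j] for i in range(K) for j in range(K))

GRID = [Fr(i, 100) for i in range(-100, 101)]  # finite grid on [-1, 1]

def summary(design):
    """Return (det M(xi), max of d(x; xi) over the grid)."""
    Minv, det = inverse_and_det(info_matrix(design))
    return det, max(d(x, Minv) for x in GRID)

xi_star = {Fr(-1): Fr(1, 3), Fr(0): Fr(1, 3), Fr(1): Fr(1, 3)}
xi_other = {Fr(-1): Fr(1, 4), Fr(-1, 2): Fr(1, 4), Fr(1, 2): Fr(1, 4), Fr(1): Fr(1, 4)}
det_star, max_star = summary(xi_star)
det_other, max_other = summary(xi_other)
print(det_star, max_star, det_other, max_other)
```

For the three-point design the grid maximum of d equals k = 3, in line with condition (3), while the four-point design has both a smaller determinant and a larger maximum of d.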


Received March 30, 1959. Research of J. Kiefer was sponsored by the Office of Naval
Research. Research of J. Wolfowitz was supported by the United States Air Force under
Contract no. AF 18(600)-685 monitored by the Office of Scientific Research.




2. The theorem. For every $\xi$ consider $M(\xi)$ as a point in Euclidean $k^2$-space, let $T$ be the totality of such points for all $\xi$ in $C$, and let $\bar{T}$ be the convex closure of $T$. It is clear that every extreme point of $\bar{T}$ can be achieved by a $\xi$ which assigns probability one to a single point. Since $C$ contains every $\xi$ with finite support, it follows that $T = \bar{T}$. The class $C$ need not, of course, be convex. However, since our argument will be concerned only with the $M(\xi)$, we may argue below as if $C$ were convex. Thus, if $\xi_1$ and $\xi_2$ are in $C$ and $(\xi_1 + \xi_2)/2$ is not, we may still discuss

$$M\left(\frac{\xi_1 + \xi_2}{2}\right)$$

because there exists a $\xi$ in $C$ with finite support, say $\xi_3$, such that

$$M(\xi_3) = M\left(\frac{\xi_1 + \xi_2}{2}\right).$$

Moreover, if $H - 1$ is the dimension of the linear space spanned by the functions $f_i f_j$, $i \le j$, any $M(\xi)$ is equal to an $M(\xi')$ where the support of $\xi'$ consists of at most $H$ points. This can often be improved, as in the case where $X$ is the unit interval and $f_i(x) = x^i$.

Call a subset $D$ of $C$ linear if the following condition holds: For every $\alpha$, $0 < \alpha < 1$, and every pair $\xi_1$, $\xi_2$ in $D$, $\alpha\xi_1 + (1 - \alpha)\xi_2$ is in $D$ whenever it is in $C$. Thus, if $C$ is convex, $D$ is also convex.

We shall prove the following: 


THEOREM. Conditions (1), (2), and (3) are equivalent. The set $B$ of all $\xi$ satisfying these conditions is linear, and $M(\xi)$ is the same for all $\xi$ in $B$.

This result has a function space corollary which may be of interest. Suppose $\xi$ satisfies (3) and that $Q$ is a real $k \times k$ matrix such that $QM(\xi)Q'$ is the identity. Then $g = Qf$ is a vector of orthonormal functions with respect to $\xi$, and $g(x)'g(x) = d(x; \xi)$. Thus we have

COROLLARY. If $f_1, \ldots, f_k$ are linearly independent, continuous, real functions on a compact space $X$, then there is a probability measure $\xi$ on $X$ and a linear transformation $g_i = \sum_j a_{ij} f_j$ such that $g_1, \ldots, g_k$ are orthonormal with respect to $\xi$ and

$$\max_x \sum_{i=1}^{k} g_i^2(x) = k.$$

The set of all such $\xi$ is the set $B$ of the theorem.


Proof of the theorem. We shall say that $\xi$ is a local solution of (1) if $\det M(\xi) > 0$ and if, for every $\xi'$,

(4) $\dfrac{\partial}{\partial\alpha} \log \det M([1 - \alpha]\xi + \alpha\xi') \Big|_{\alpha = 0+} \le 0.$

Now, if $\det M(\xi) > 0$, $A$ is such that $A M(\xi) A'$ is the identity, and $A M(\xi') A'$ is diagonal with diagonal elements $b_i$, then $\det M([1 - \alpha]\xi + \alpha\xi') = (\det A)^{-2} \prod_i [1 - \alpha + \alpha b_i]$, from which we easily compute that $-\log \det M([1 - \alpha]\xi + \alpha\xi')$ is convex in $\alpha$ ($0 < \alpha < 1$) and is strictly convex unless all $b_i = 1$ (that is, unless $M(\xi) = M(\xi')$). Hence, if $\det M(\xi') > \det M(\xi)$, equation (4) cannot hold for that $\xi'$. We conclude that local solutions of (1) are actual solutions of (1), and of course the converse is true. Moreover, if $\det M(\xi) = \det M(\xi') = h > 0$, we have $\det M(\xi/2 + \xi'/2) > h$ unless $M(\xi) = M(\xi')$, so that $\xi$ and $\xi'$ cannot both satisfy (1) unless $M(\xi) = M(\xi')$. It follows from this and the linearity in $\xi$ of $M(\xi)$ that, if $\xi$ and $\xi'$ both satisfy (1), then so does $\alpha\xi + (1 - \alpha)\xi'$, whenever it is in $C$.

It now suffices to prove that $\det M(\xi) > 0$ and $\xi$ satisfies (4) for all $\xi'$, if and only if $\xi$ satisfies (2), and if and only if it satisfies (3). First suppose $\xi$ satisfies (4) and that $\det M(\xi) > 0$. Performing the differentiation in (4), and denoting by $M_{ij}$ the cofactor of $m_{ij}$, we have

(5) $0 \ge [\det M(\xi)]^{-1} \sum_{i,j} \dfrac{\partial \det M}{\partial m_{ij}} \, \dfrac{d\, m_{ij}([1 - \alpha]\xi + \alpha\xi')}{d\alpha} \Big|_{\alpha = 0}$

$\quad = [\det M(\xi)]^{-1} \sum_{i,j} \Big( \dfrac{\partial}{\partial m_{ij}} \sum_{a} m_{ia} M_{ia} \Big) [m_{ij}(\xi') - m_{ij}(\xi)]$

$\quad = [\det M(\xi)]^{-1} \sum_{i,j} M_{ij}(\xi) [m_{ij}(\xi') - m_{ij}(\xi)] = \sum_{i,j} m^{ij}(\xi) m_{ij}(\xi') - k.$

Letting $\xi'$ give measure one to the point $x$, we obtain

(6) $f(x)'[M(\xi)]^{-1} f(x) \le k$

for all $x$. Thus, (3) is satisfied and, as we have remarked, this implies (2).

Finally, if (2) is satisfied, we must have (6) for all $x$, since we have just seen that there always exist $\xi$'s satisfying (3). Hence, for any $\xi'$ with finite support, we obtain $\sum_{i,j} m^{ij}(\xi) m_{ij}(\xi') \le k$. Hence this inequality is valid for all $\xi'$, and (5) is satisfied. This completes the proof of the theorem.


3. Extensions and applications. We remark that it is easy to see that, if $R$ is bounded but not compact, and if $\{\xi_i\}$ is a sequence of measures on $S$, then $\lim_i \det M(\xi_i)$ is a maximum if and only if $\lim_i \sup_x d(x; \xi_i)$ is a minimum, and if and only if $\lim_i \sup_x d(x; \xi_i) = k$. Similarly, the first part of the corollary holds with the replacement $\sup_x \sum_i g_i^2(x) < k + \epsilon$, for any $\epsilon > 0$.

We now describe briefly the statistical applications of the results. An integer $N$ is given, and the statistician must choose $N$ points $x_1, \ldots, x_N$ (not necessarily distinct) corresponding to which he obtains observations on uncorrelated random variables $Y_i$ ($1 \le i \le N$) with common variance $\sigma^2$ (perhaps unknown) and with expectation $\sum_{j=1}^{k} \theta_j f_j(x_i)$, where the $\theta_j$ are unknown real parameters. If $\xi(x)$ denotes the proportion of $x_i$'s which are equal to $x$, we find that the covariance matrix of best linear estimators of $\theta_1, \ldots, \theta_k$ is $N^{-1}\sigma^2 [M(\xi)]^{-1}$. The function $\xi$ is called the experiment or the experimental design. A criterion often adopted for choosing a design is to minimize the determinant of the above covariance matrix (the "generalized variance"). Another possible criterion is to minimize the maximum over $x$ of the variance $N^{-1}\sigma^2 d(x; \xi)$ of the "best linear estimator," given $\xi$, of the "regression function" $\sum_j \theta_j f_j(x)$. If we consider not merely the class $C_N$ of probability measures $\xi$ which take on only integral multiples of $N^{-1}$ as values, but rather all probability measures $\xi$ in $C$, then our result is that the two optimality criteria are equivalent. Moreover, for any $\xi$ with support on $H$ points which satisfies (1), (2), and (3), there is clearly a $\xi'$ in $C_N$ which achieves (1), (2), and (3) to within a multiplicative factor $1 + O(N^{-1})$, and is easy to write down from $\xi$. Since the exactly optimum designs are often difficult to obtain, depend on $N$, and differ for the two criteria, we see the practical importance of our considerations.

It is very helpful to use the interplay of the two criteria (1) and (2) in obtaining a solution. For example, one can sometimes guess that a solution exists which is a member of a class of $\xi$ which depend on several parameters. One may use (1) as the more convenient initial approach, maximize $\det M(\xi)$ over the parametric class, and then verify whether the maximum just obtained is indeed a maximum over all $\xi$ (which may be difficult in terms of (1)) by verifying (3). It is useful to note that, if $\xi$ has a set consisting of $k$ points as its support, then it gives equal measure to each of these points. (This is part of Theorem 5 of (1).) Examples which make use of such methods will appear elsewhere, as will generalizations such as one concerned with the minimization of the determinant of a principal minor of $M(\xi)^{-1}$.
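The guess-then-verify procedure described above can be sketched numerically. The example below is an illustration under stated assumptions, not taken from the paper: it assumes quadratic regression, $f(x) = (1, x, x^2)'$ on $X = [-1, 1]$, guesses the design with equal mass on $-1, 0, 1$, and checks condition (3) on a finite grid (a grid check is only a numerical indication, not a proof). All function names are hypothetical.

```python
from fractions import Fraction

def f(x):
    """Assumed regression functions: f(x) = (1, x, x^2)'."""
    return [Fraction(1), x, x * x]

def info_matrix(points, weights):
    """M(xi) = sum_i w_i f(x_i) f(x_i)' for a finitely supported design."""
    M = [[Fraction(0)] * 3 for _ in range(3)]
    for p, w in zip(points, weights):
        v = f(p)
        for i in range(3):
            for j in range(3):
                M[i][j] += w * v[i] * v[j]
    return M

def det3(M):
    """Determinant of a 3 x 3 matrix by cofactor expansion."""
    (a, b, c), (d, e, g), (h, i, j) = M
    return a * (e * j - g * i) - b * (d * j - g * h) + c * (d * i - e * h)

def inverse(M):
    """Exact Gauss-Jordan inverse of a non-singular matrix of Fractions."""
    n = len(M)
    A = [row[:] + [Fraction(int(r == c)) for c in range(n)] for r, row in enumerate(M)]
    for c in range(n):
        piv = next(r for r in range(c, n) if A[r][c] != 0)
        A[c], A[piv] = A[piv], A[c]
        pivot = A[c][c]
        A[c] = [v / pivot for v in A[c]]
        for r in range(n):
            if r != c and A[r][c] != 0:
                factor = A[r][c]
                A[r] = [a - factor * b for a, b in zip(A[r], A[c])]
    return [row[n:] for row in A]

def max_d(points, weights, grid):
    """Largest value of d(x; xi) = f(x)' M(xi)^{-1} f(x) over the grid."""
    Minv = inverse(info_matrix(points, weights))
    def d(x):
        v = f(x)
        return sum(v[i] * Minv[i][j] * v[j] for i in range(3) for j in range(3))
    return max(d(x) for x in grid)

grid = [Fraction(i, 100) for i in range(-100, 101)]
pts = [Fraction(-1), Fraction(0), Fraction(1)]
equal = [Fraction(1, 3)] * 3                               # guessed design
other = [Fraction(1, 4), Fraction(1, 2), Fraction(1, 4)]   # a competitor

det_equal, det_other = det3(info_matrix(pts, equal)), det3(info_matrix(pts, other))
max_equal, max_other = max_d(pts, equal, grid), max_d(pts, other, grid)
print(det_equal, max_equal, det_other, max_other)
```

On this grid the equal-weight design attains $\max_x d = 3 = k$, while the competitor has both a smaller determinant and a larger maximum of $d$, in line with the equivalence of (1), (2), and (3).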


REFERENCE 


1. J. Kiefer and J. Wolfowitz, Optimum designs in regression problems, Ann. Math. Stat., 30 
(1959). 


Cornell University 
















ON RELATIVELY INVARIANT MEASURES 
MARK MAHOWALD 


1. Introduction. In this note we will discuss the question of the measurability of the multiplier function of a relatively invariant measure on a group. That is, for a group $G$, $\sigma$-ring $S$, and a measure $\mu$ defined on the sets of $S$, we assume: $E$ in $S$, $x$ in $G$ implies $xE$ is in $S$ and $\mu(xE) = \sigma(x)\mu(E)$, and study the measurability of the function $\sigma(x)$.

The problem was discussed by Halmos (1, p. 265), on locally compact 
groups and there the situation proved to be as nice as it could be, that is, if 
the measure is a non-trivial, relatively invariant Baire measure then the 
multiplier function is continuous. We prove two theorems for groups in which 
no topology is assumed. In the first theorem we assume a shearing condition 
and answer the question completely. The second theorem places a condition 
on the measure and weakens the shearing assumption. Its proof is compli- 
cated and occupies the major portion of this paper. 


2. Definitions and Notation. We shall use the measure-theoretic nota- 
tion and definitions of (1) with these modifications and additions. All measures 
which are considered are complete. 

2.1. A left-invariant ring, R, is a ring of subsets of a group, G such that 
E in R implies xE is in R for all x in G. 

2.2. When we say a function, $f$, is $S$-measurable we mean that for $E$ in $S$ and $M$ a Borel set of the real line, $E \cap f^{-1}(M) \cap N(f)$ is in $S$. ($N(f) = \{x : f(x) \ne 0\}$.)

2.3. $(G, S, \mu)$ will be a measure space such that $G$ is a group and $S$ is a left-invariant $\sigma$-ring of subsets.

2.4. If $E$ and $xE$ are measurable and $\mu(xE) = \sigma(x)\mu(E)$ and if $\mu$ is not identically equal to zero and is $\sigma$-finite then $\mu$ is called relatively invariant and will be denoted by $(\sigma)\mu$. Note that the definition of $\sigma(x)$ implies that $0 < \sigma(x) < \infty$, all $x \in G$, $\sigma(xy) = \sigma(x)\sigma(y) = \sigma(yx)$, $\sigma(e) = 1$, $\sigma(x)\sigma(x^{-1}) = 1$.

2.5. By $H(S)$ we shall mean the hereditary $\sigma$-ring generated by $S$.

2.6. In $(G, H(S), (\sigma)\mu^*)$ we shall define an outer measure integral denoted by $\delta^*(E) = \int^*_E f(x)\, d\mu^*$, where $f$ is an arbitrary non-negative function on $G$ and

$$\delta^*(E) = \lim_{n \to \infty} \sum_{i=1}^{n2^n} (i - 1) 2^{-n} \mu^*(E_{ni} \cap E)$$

Received March 11, 1959. This paper is part of a thesis submitted to the University of 
Minnesota. The author is indebted to Professor Gelbaum for his help and guidance during 
the preparation of this paper. In addition, the author wishes to thank the referee for helpful 
suggestions, particularly in the proof of Theorem 2. 




where $E_{ni} = \{x : (i - 1)2^{-n} \le f(x) < i2^{-n}\}$ for $i = 1, \ldots, n2^n$. Note that $0 < \sigma(x) < \infty$ implies that if $\delta^* = \int^* \sigma(x^{-1})\, d\mu^*$, then $\delta^*(E) = 0$ if and only if $\mu(E) = 0$.

2.7. $(G, S)$ will be said to satisfy the shearing condition if the transformation from $G \times G$ to $G \times G$ defined by $\theta(x, y) = (x, xy)$ is a measurability preserving transformation (carries $S \times S$ onto $S \times S$).

2.8. By weak shearing we shall mean that if f(x) is S-measurable then 
g(x, y) = f(xy) is S X S-measurable. 

2.9. By condition A on a measure space we shall mean that the space is the union of a disjoint class $\mathcal{D}$ of measurable sets of finite measure with the property that every measurable set may be covered by countably many sets of $\mathcal{D}$ and a set of measure zero.

Remark. According to Halmos (1, p. 132) this implies that the Radon- 
Nikodym theorem is valid. 

2.10. We say that $(G, S, \mu)$ is countably coverable if for every set $E$ of positive measure and any other measurable set $F$, there exist $x_i$, $i = 1, 2, \ldots$ such that $F - \bigcup_i x_i E$ has measure zero.

Remark. Lebesgue measure is countably coverable.

2.11. By a measure group we shall mean a measurable space $(G, S)$ such that $G$ is a group and $S$ is left-invariant and satisfies the shearing condition.


3. Measurability theorems. 


THEOREM 1. Let $(G, S)$ be a measure group and let $(\sigma)\mu$ be a relatively invariant measure defined on $S$. Then $\sigma$ is $S$-measurable.

Proof. From the definition of shearing we have, for any subset $E$ of $G \times G$, $(\theta(E))_x = xE_x$. (See (1), p. 258.) Let $E = F \times F$, where $F$ is in $S$. By Fubini's theorem we have that

$$\int \chi_{\theta(E)}\, d\mu(y) = \mu((\theta(E))_x) = \mu(xE_x) = \sigma(x)\mu(F)\chi_F$$

is a measurable function of $x$. Therefore, $\sigma(x)\mu(F)\chi_F$ is measurable but $\mu(F)$ is a constant and $F$ is an arbitrary set in $S$; hence $\sigma(x)$ is $S$-measurable.


COROLLARY. In a measure group the existence of one non-trivial measure $(\sigma)\mu$ implies that any other non-trivial $(\sigma')\mu'$ can be written as

$$\mu'(E) = K \int_E \sigma'/\sigma \, d\mu.$$

Proof. The theorem implies that both $\sigma$ and $\sigma'$ are measurable. Let $\theta(E) = \int_E \sigma(x^{-1})\, d\mu$ and $\theta'(E) = \int_E \sigma'(x^{-1})\, d\mu'$. Both $\theta$ and $\theta'$ are invariant measures and $(G, S, \theta)$ and $(G, S, \theta')$ are measurable groups (see (1), page 257). Therefore Theorem 60.B of (1) applies and shows that $K\theta = \theta'$. Let

$$f_n = \sum_{m=1}^{M} a_{nm} \chi_{E_{nm}}$$

be a sequence of simple functions monotonically converging to $\sigma'$. Then

$$\mu'(E) = \int_E \sigma'(x)\, d\theta' = \lim_{n \to \infty} \int_E f_n \, d\theta' = \lim_{n \to \infty} K \sum_{m=1}^{M} a_{nm} \int_{E_{nm} \cap E} \sigma(x^{-1})\, d\mu = K \int_E (\sigma'/\sigma)\, d\mu.$$
For Theorem 2 we shall need the following lemmas: 


LEMMA 1. For arbitrary non-negative function $f$ on $G$, $\delta^*$, the outer measure integral of $f$ in $(G, H(S), \mu^*)$, is an outer measure on $H(S)$ and the $\sigma$-ring of $\mu^*$-measurable sets is contained in the $\sigma$-ring of $\delta^*$-measurable sets.

Proof. The fact that $\delta^*$ is an outer measure follows immediately from the definition. Let $E$ be $\mu^*$-measurable. Then for arbitrary $A \in H(S)$ we have

$$\delta^*(A) = \lim_{n \to \infty} \sum_{i=1}^{n2^n} (i - 1)2^{-n} \mu^*(A \cap E_{ni})$$

$$= \lim_{n \to \infty} \sum_{i=1}^{n2^n} (i - 1)2^{-n} [\mu^*(A \cap E_{ni} \cap E) + \mu^*(A \cap E_{ni} \cap E')]$$

$$= \delta^*(A \cap E) + \delta^*(A \cap E').$$

This completes the proof of the lemma.


LEMMA 2. If $\delta^*(E) = \int^*_E f\, d\mu^*$, then $f$ is $R$-measurable, where $R$ is the collection of $\delta^*$-measurable sets.

Proof. It is sufficient to show $E_{Nj}$ satisfies the Carathéodory criterion for all $A \in H(S)$. For $N$ and $j$ fixed and $n > N$, we have either $E_{Nj} \cap E_{ni} = \emptyset$ or $E_{ni}$. Therefore, for arbitrary $A \in H(S)$ we have

$$\delta^*(A) = \lim_{n \to \infty} \sum_{i=1}^{n2^n} (i - 1)2^{-n} (\mu^*(E_{Nj} \cap A \cap E_{ni}) + \mu^*(E'_{Nj} \cap A \cap E_{ni}))$$

$$= \delta^*(A \cap E_{Nj}) + \delta^*(A \cap E'_{Nj}).$$


LEMMA 3. $\delta^*(E) = \int^*_E \sigma(x^{-1})\, d\mu^*$ is an invariant outer measure on $H(S)$ and the restriction of $\delta^*$ to $S$ is an invariant measure on $S$.
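A concrete instance may help fix ideas. The sketch below is an illustration that is not in the paper: it assumes $G$ is the multiplicative group of positive reals, $\mu$ is Lebesgue measure restricted to intervals $(a, b)$, so that $\mu(xE) = x\mu(E)$ and the multiplier is $\sigma(x) = x$. Integrating $\sigma(t^{-1}) = 1/t$ against $\mu$ then gives $\log(b/a)$, which is unchanged by translation, as Lemma 3 asserts. All function names are hypothetical.

```python
import math

def mu(a, b):
    """Lebesgue measure of the interval (a, b), 0 < a < b (assumed setting)."""
    return b - a

def translate(x, a, b):
    """Left translate x * (a, b) in the multiplicative group of positive reals."""
    return (x * a, x * b)

def sigma(x):
    """Multiplier of Lebesgue measure under multiplication: mu(xE) = x * mu(E)."""
    return x

def delta(a, b):
    """Integral of sigma(t^{-1}) = 1/t over (a, b) with respect to mu."""
    return math.log(b / a)

x, a, b = 3.0, 2.0, 5.0
relative = (mu(*translate(x, a, b)), sigma(x) * mu(a, b))   # equal: relative invariance
invariant = (delta(*translate(x, a, b)), delta(a, b))       # equal: invariance of delta
print(relative, invariant)
```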


The proof of this lemma is long and will be given in § 4. We now have 
this 


THEOREM 2. Let $(G, S, (\sigma)\mu)$ satisfy condition A and be countably coverable and suppose that there exists a set $E \in S$ such that $0 < \delta^*(E) < \infty$ (with $\delta^*$ as in Lemma 3). Then there exists a $\sigma$-ring $R$ containing $S$ and a measure $(\sigma)\bar{\mu}$ on $R$ which is an extension of $\mu$ on $S$ such that $\sigma$ is $R$-measurable. If in addition $S$ satisfies the weak shearing condition then $\sigma(x)$ is $S$-measurable.










Proof. By Lemma 3, $\delta^*$ restricted to $S$ is a measure. Since $\delta^*(E) = 0$ if $\mu(E) = 0$, $\delta^* \ll \mu$ and condition A then implies the Radon-Nikodym theorem is valid. Let $f$ be the R-N derivative. Let $E \in S$ be such that $0 < \delta^*(E) < \infty$. Let $A$ be any set of $\mathcal{D}$. There exist $\{x_i\}$, $i = 1, 2, \ldots$, such that $\bigcup_i x_i E \supset A$. Therefore, on $A$, $\delta$ is $\sigma$-finite. Hence $f$ can be chosen to be finite-valued on $A$, hence on $G$. On each subset $F$ of $A$ such that $\delta^*(F) < \infty$, we have,


F 


6*(F) = J S(y)dp = &*(xF) = | f(y)dp = o(x™') | f(x "y) dy. 
w/rF F 
Therefore, for each x 
(1) f(y) = o(x—")f (xy), {u] in y for y in A. 


Since the $A$ are disjoint and a countable union of them cover any measurable set to within a set of measure zero the formula is valid for all $x$, $[\mu]$, when $y$ is restricted to any measurable set.

(1) implies $\bar{\mu} = \int (f(x))^{-1}\, d\delta$ is a relatively invariant measure with $\sigma$ as the multiplier function on the $\sigma$-ring of $\delta^*$-measurable sets $R$. Lemma 2 shows that $\sigma$ is $R$-measurable. Therefore, we have only to show that $\bar{\mu}$ is an extension of $\mu$. Using Theorem B, page 134 of (1), we have

$$\int_E (f)^{-1}\, d\delta = \int_E (f)^{-1} f \, d\mu = \mu(E),$$

for every $E$ in $S$. Therefore, $\bar{\mu}$ satisfies the theorem and this completes the proof of the first part of the theorem.


The weak shearing condition implies that {(y)o(x—') —f(xy) = g(x, y) is RX R- 
measurable. On every set in R X R, g(x, y) is integrable and its integral will 
be zero by (1) and the Fubini theorem. Let A be any set in D with u(A) > 0. 
Then 6*(A) > 0 and A contains a set of points of positive u-measure at 
which 0 < f(y) < . Let E be the subset of A X A for which f(y)o(x~") 
— f(xy) # 0. Then g@ X @(E) = 0. Therefore, for almost all y in E, 


If A, = {x:f(y)o"'(x) — f(xy) = 0}, @(A,) = 0 for almost all y in A by 
the Fubini theorem. If @(A,) = 0, then 6*(4,) = 0. Whence »(A,) = 0, using 
2.6. Thus there exists y € E with 0 < f(y) < @ and such that f(y)o—'(x) 
— f(xy) = 0 for almost all x in A[u]. The measurability of f(xy) then implies 
that o(x) is measurable in A, and the definition of A implies that (x) is 
S-measurable. 


4. Proof of Lemma 3. We shall prove a sequence of remarks which 
will lead to the lemma. 





0 if 
orem 
< oe, 
> A. 


n A, 


rable 
when 


is the 
shows 
nsion 


»s the 


RX R- 
al will 
)>0. 
ure at 
o (x) 


A by 
, using 
1a! (x) 
mplies 
a(x) is 


which 


ON RELATIVELY INVARIANT MEASURES 371 


REMARK 1. $\mu^*(xE) = \sigma(x)\mu^*(E)$ for all $E$ in $H(S)$.

Proof. This statement is an immediate consequence of the definition of an outer measure and the relative invariance of $\mu$.

In the following let $E$ be any set in $H(S)$ such that $\mu^*(E_{ni} \cap E) < \infty$ for all $n$ and $i \ne 1$.

From Remark 1 we have

$$\delta^*(E) = \lim_{n \to \infty} \sum_{i=1}^{n2^n} \frac{i - 1}{2^n} \mu^*(E_{ni} \cap E) = \lim_{n \to \infty} \sum_{i=1}^{n2^n} \frac{i - 1}{2^n \sigma(y)} \mu^*(yE_{ni} \cap yE).$$

Let $A(N, i) = \{j : (i - 1)/2^n\sigma(y) < (j - 1)/2^N < j/2^N < i/2^n\sigma(y)\}$ for $i = 1, \ldots, n2^n$. Note that $j$ in $A(N, i)$ implies $E_{Nj} \subset yE_{ni}$ and that

$$\bigcup_{j \in A(N, i)} E_{Nj} \subset \bigcup_{j \in A(M, i)} E_{Mj}$$

if $N < M$.
In Remarks 2 and 3 we shall be concerned with a particular $i$ and fixed $n$ and $y$: hence we shall suppress the $i$ in the notation $A(N)$.


REMARK 2.
$$yE_{ni} = \lim_{N \to \infty} \bigcup_{j \in A(N)} E_{Nj} \cup I$$
where $I = \{x : \sigma(x) = 2^n\sigma(y)/(i - 1)\}$.

Proof. From the definition of $A$ we see that the right side is a subset of the left side. Let $z$ be a member of the left side. Then
$$\sigma(z^{-1}) = (i - 1)(2^n\sigma(y))^{-1}$$
or
$$(i - 1)(2^n\sigma(y))^{-1} < \sigma(z^{-1}) < i(2^n\sigma(y))^{-1}.$$

The first case implies $z$ is in $I$. For the second case there exists an $M$ and $j \in A(M)$ such that
$$(i - 1)/2^n\sigma(y) < (j - 1)2^{-M} \le \sigma(z^{-1}) < j/2^M < i/2^n\sigma(y).$$

Therefore $z$ is in the union over $A(M)$ and $\bigcup_{j \in A(N)} E_{Nj}$ is an increasing sequence of sets; hence the remark follows.


REMARK 3. Let a >O be arbitrary; then, for any E © H(S) such that 
u*(E,;,C\ E) < @, there exists an M such that 


u*(yEn: CO yE) < D> u*(Ews O yE) + w*(1 0 yE) + a2"n 


jeA’‘M) 


Proof. From Remark 2 we have 











u*(yEn: tr) yE) < u* (lim U jew) En; () yE) + u*(1 Cr) yE) 
Na 


= lim B*(U sea Ens 1 YE) + w* (1 1 yE) 


N-co 
< lim =< u* (Ey; 1) yE) + u*(1 1) yE). 
Now jeA(N) 
Since the left side is finite, there exists an M such that the remark holds. 
We can do this for each i obtaining an M,. If we are given y and fix nm such 
that o(y)m > 1 and if we let No = max{M,, logec(y) + n} then Remark 3 


holds uniformly in i for all NV > No. In addition, since 1/2"e(y) > 1/2%, there 


exists one distinct 7; for each i such that 


bo SF 
We then can prove 
REMARK 4. 
n2" 
> (i — 1)(2"o(y))u* (vEn: 1) YE) 
i=] 


N2N 


<b Gj — 12 u* (Ens ON yE) + YS 2% y* (Ens, Q yE) +4 
j=l i=1 
for all N > Nz. 


Proof. We shall call the left side of the inequality K,. Then, from Remark 3 
and the definition of A(N, 71), we have 


K.< > G- 1/2) | > w*(Ewy, 0 yE) + u*(1N xe) | 


i=] Jt AN, i) 


n2" 
+ = a(t — 1)/n'2"e(y) 
i=] 


A 
iMs 
M 


[Gj — 1)2"Ju*(En,; A yE) 


n2" 
, on =! A 
+d G— 1)(2"e(y)) u*(yE (\ I) +a. 
i=1 
Since there exists one j; for each i, we have for all VN > N, 
N2N 


K, < > (Gj — 1)2"u* (Ew; N YE) 


j=1 
+at+ Do [i — 1)Q"o(y))* — (j, — 1)2-* Ju* (Ex), A YE). 
i=1 


Since (¢ — 1)(2"e(y))-' — (fj; — 1)2-* < 2-", the remark follows. 


REMARK 5. The lemma is true if $\mu^*(E) < \infty$.

Proof. If $\mu^*(E) = 0$ we are finished. Therefore we shall assume that $0 < \mu^*(E) < \infty$. Then from Remark 4 and the monotonicity of the outer measure, we have

$$K_n \le \alpha + \delta^*(yE) + (n2^n/2^N)\mu^*(yE).$$

Letting $N \to \infty$ we have $K_n \le \alpha + \delta^*(yE)$. This is true for all $n$ from some point on; therefore, $\delta^*(E) \le \alpha + \delta^*(yE)$. Since $\alpha$ is arbitrary we have $\delta^*(E) \le \delta^*(yE)$. Applying this inequality to the set $yE$ and $y^{-1}$ we conclude that $\delta^*(E) \ge \delta^*(yE)$ and the remark follows.

REMARK 6. If $\mu^*(E) = \infty$, then the lemma is true if there exists a $K$ such that $\mu^*(yE_{ni} \cap yE) < 2^n K$ for all $n$ and $i \ne 1$.

Proof. Since $1/2^N < 1/2^n\sigma(y)$, we have

$$E_{N, j_i} \subset yE_{n, i} \cup yE_{n, i-1}.$$

Then from Remark 4 we have

$$K_n \le \delta^*(yE) + \sum_{i=1}^{n2^n} 2^{-N} [\mu^*(yE_{n, i} \cap yE) + \mu^*(yE_{n, i-1} \cap yE)] + \alpha \le \delta^*(yE) + 2 \cdot 2^n K \, n2^n 2^{-N} + \alpha.$$

The remark now follows as in Remark 5.


REMARK 7. If $\mu^*(E) = \infty$ and there does not exist a $K$ as in Remark 6, then the lemma is true.

Proof. Let $K'$ be given. Then there exists an $n$ and $i_0 \ne 1$ such that
$$\mu^*(yE_{ni_0} \cap yE) = \sigma(y)\mu^*(E_{ni_0} \cap E) > 2^n K' \sigma(y).$$
This implies
$$\delta^*(E) \ge \sum_i (i - 1)2^{-n} \mu^*(E_{ni} \cap E) > K'.$$
Hence $\delta^*(E) = \infty$. The result follows as in Remark 5.

This completes the proof of the Lemma. The case which was excluded just before Remark 2, that is, $E$ such that $\mu^*(E_{ni} \cap E) = \infty$ for some $n$ and $i \ne 1$, is clearly covered in Remark 7.


REFERENCE 


1. P. R. Halmos, Measure theory (New York: Van Nostrand Co., Inc.). 


Xavier University 
Cincinnati, Ohio 








SUMS OF FUNCTIONS OF DIGITS 
B. M. STEWART 


1. Introduction. We generalize in several directions a paper by Porges (2) who considered the integer $F(A)$ obtained from the positive integer $A$ by taking the sum of the squares of the digits of $A$. Porges showed that if $A > 99$, then $F(A) < A$, so that under iteration of $F(A)$ all the positive integers are divided into a finite number of classes, called orbits in the terminology of Isaacs (1), each containing a finite cycle. For his $F(A)$ Porges showed there are only two orbits: one with the 1-cycle: $1 \to 1$; and the other with the interesting 8-cycle: $4 \to 16 \to 37 \to 58 \to 89 \to 145 \to 42 \to 20 \to 4$.
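Porges' iteration is easy to reproduce. The following is a minimal sketch (the function names `digit_sum` and `cycle_reached` are illustrative, not from the paper) that iterates the sum of squares of decimal digits until a value repeats and returns the cycle that is entered.

```python
def digit_sum(A, B=10, P=lambda a: a * a):
    """F(A): sum of P(a) over the base-B digits a of A, with F(0) = P(0)."""
    if A == 0:
        return P(0)
    total = 0
    while A > 0:
        A, a = divmod(A, B)   # peel off the lowest digit
        total += P(a)
    return total

def cycle_reached(A, B=10, P=lambda a: a * a):
    """Iterate F from A until a value repeats; return the cycle entered."""
    seen = []
    while A not in seen:
        seen.append(A)
        A = digit_sum(A, B, P)
    return seen[seen.index(A):]

print(cycle_reached(4))   # the 8-cycle starting at 4
print(cycle_reached(7))   # 7 -> 49 -> 97 -> 130 -> 10 -> 1 -> 1
```

Scanning a range of starting values with `cycle_reached` is one way to observe that only the two cycles named above occur.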

Consider the set $Z$ of non-negative integers and choose as a base of enumeration any desired integer $B \ge 2$ (not necessarily $B = 10$). Then only the "digits" $0, 1, 2, \ldots, B - 1$ are needed, in suitable multiplicity, to represent any $A$ of $Z$. Suppose there is given an arbitrary function assigning to each digit $a$ the value $P(a)$ in $Z$. (In Porges' example the special function used is $P(a) = a^2$.) Each $A$ in $Z$ has a unique representation to the base $B$, hence if $F(A)$ is defined to be the sum of the values of $P(a)$, summed over all the digits of $A$, then not only is $F(A)$ well-defined, but also $F(A)$ is an integer of $Z$, so $F(F(A))$ is meaningful and continued iteration is possible.


More precisely, let $a$ and $a_i$ be restricted to the set $0, 1, 2, \ldots, B - 1$ and let $a_i'$ be restricted to the subset $1, 2, \ldots, B - 1$. Then any integer $A$ in the range $B^k \le A < B^{k+1}$, $k > 0$, has a unique representation

$$A = a_k' B^k + \sum_{i=0}^{k-1} a_i B^i.$$

After $P(a)$ has been given, we make the definitions

$$F(a) = P(a), \qquad F(A) = P(a_k') + \sum_{i=0}^{k-1} P(a_i),$$

and thus obtain the type of function which suggested the title of this paper. 

We propose to study the growth of the function $F(A)$ and to exhibit certain regularities in the behaviour of $F(A)$ despite the arbitrariness of $P(a)$. For example, it proves easy to demonstrate (Theorem 1) the existence of an integer $C$ such that $F(C) \ge C$ and $F(A) < A$ for every $A > C$. Then a more detailed analysis is presented, using an auxiliary constant $S$, to construct an algorithm (Theorem 2) for the evaluation of $C$. As an aid in finding the value of $S$, certain other constants $J$ and $L$ are introduced and they provide further interesting sidelights (Theorems 3, 4, 5) on the behaviour of $F(A)$.


Received January 2, 1957; in revised form January 5, 1960. 




These general results are applied to the special case $P(a) = a^t$ with considerable effectiveness (Theorems 6, 7, 8).

A preliminary study is made of the orbit- and cycle-numbers resulting from 
the iteration of F(A) and the finiteness of these numbers is assured. The 
teasing irregularities of these numbers are shown by selected tables. 

Finally, a brief section is presented concerning products of functions of 
digits. 


2. Existence of C. If proving the existence of $C$ is the only concern, we may assume merely that $P(z)$ is a complex function for which $P(a)$ is defined for every $a$. Define $F(A)$ as above.

THEOREM 1. To any real $\epsilon > 0$ there corresponds an integer $C = C(\epsilon)$ such that $|F(C)| \ge \epsilon C$ and such that $|F(A)| < \epsilon A$ for every $A > C$.

Proof. Let $P$ be the maximum value of $|P(a)|$. Since $B^k/(k + 1)$ is increasing and unbounded for $k = 0, 1, 2, \ldots$, there exists $K = K(\epsilon, P)$ such that $B^k/(k + 1) > P/\epsilon$ when $k > K$. If $B^k \le A < B^{k+1}$, then $|F(A)| \le (k + 1)P < \epsilon B^k \le \epsilon A$ for all $k > K$. Also $|F(0)| \ge 0$. Hence $C$ exists, $0 \le C < B^{K+1}$.


In the sequel our intention to study iteration of $F(A)$ leads us to insist that the values of $P(a)$ be in $Z$ and to avoid painful details we discuss only the case $\epsilon = 1$. As an aside, note that by the usual interpolation formula there exists a polynomial $P_1(x)$ with rational coefficients and degree at most $B - 1$ which will take on for the set $\{a\}$ the prescribed values $\{P(a)\}$. However, it may be convenient to use polynomials of degree higher than $B - 1$, but of simpler structure, as in the case $P(a) = a^t$ when $t \ge B$.
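The proof of Theorem 1 is constructive enough to turn into a finite search when $\epsilon = 1$ and the values $P(a)$ are non-negative integers. The sketch below is an illustration under those assumptions: it finds the first exponent $k$ with $B^k > (k + 1) \max P(a)$, beyond which $F(A) < A$ is guaranteed by the growth of $B^k/(k + 1)$, and then scans downward. The names `F` and `largest_C` are illustrative.

```python
def F(A, P, B):
    """Sum of P over the base-B digits of A; P is a list indexed by digit."""
    if A == 0:
        return P[0]
    total = 0
    while A > 0:
        A, a = divmod(A, B)
        total += P[a]
    return total

def largest_C(P, B):
    """Largest C with F(C) >= C (the case epsilon = 1), by a finite scan."""
    p_max = max(P)
    k = 0
    # B**j / (j + 1) is non-decreasing, so once B**k > (k + 1) * p_max holds,
    # every A >= B**k has F(A) <= (digits of A) * p_max < A.
    while B ** k <= (k + 1) * p_max:
        k += 1
    for A in range(B ** k - 1, -1, -1):
        if F(A, P, B) >= A:
            return A   # always reached: F(0) = P(0) >= 0

squares = [a * a for a in range(10)]
print(largest_C(squares, 10))   # Porges' case: sum of squared decimal digits
```

This brute-force scan is only practical for small bases and small values of $P$; it serves as a cross-check rather than as an efficient method.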


3. Algorithm for C. Let $H(A) = F(A) - A$ and $H_i(a) = P(a) - aB^i$ for $i \ge 0$. If $B^k \le A < B^{k+1}$, then for $k > 0$,

$$H(A) = H_k(a_k') + \sum_{i=0}^{k-1} H_i(a_i).$$

The properties defining $C$ when $\epsilon = 1$ may now be restated:
$$H(C) \ge 0, \qquad H(A) < 0 \text{ for every } A > C.$$

Let $m_i$ be the maximum value of $a$ for which $H_i(a)$ is a maximum, and let $m_i'$ be the maximum value of $a'$ for which $H_i(a')$ is a maximum. Then in the range $B^k \le A < B^{k+1}$ when $k > 0$, the maximum value $U_k$ of $H(A)$ is given by $U_k = H(M_k)$, where

$$M_k = m_k' B^k + \sum_{i=0}^{k-1} m_i B^i;$$

when $k = 0$, $U_0 = H(m_0')$. Define $U_{-1} = H(0)$.

Define the integer $S$ by the conditions $U_S \ge 0$ and $U_k < 0$ for every $k > S$. The existence of $S$ follows immediately from $U_{-1} = P(0) \ge 0$ and from Theorem 1, since $H(A) < 0$ for $A > C$ implies $U_k < 0$ for every $k > K$. Hence $-1 \le S \le K$. (In the next section we give much improved estimates of $S$.) These observations establish the following

LEMMA. If $S = -1$, $C = 0$. If $S \ge 0$, $B^S \le M_S \le C < B^{S+1}$.
To determine the exact value of $C$ when $S$ is known and $S \ge 0$, consider

$$U_S = H_S(m_S') + \sum_{i=0}^{S-1} H_i(m_i).$$

Determine a maximum $c_S'$ such that

(1) $H_S(c_S') + U_S - H_S(m_S') \ge 0.$

This selection is possible with $B - 1 \ge c_S' \ge m_S' \ge 1$, for at least the choice $c_S' = m_S'$ makes (1) hold, since $U_S \ge 0$.

Next (assuming $S > 0$) determine a maximum $c_{S-1}$ such that

$$H_{S-1}(c_{S-1}) + H_S(c_S') + U_S - H_S(m_S') - H_{S-1}(m_{S-1}) \ge 0.$$

This choice is possible with $B - 1 \ge c_{S-1} \ge m_{S-1}$, for at least the choice $c_{S-1} = m_{S-1}$ is valid, because of the previous step (1).

Proceed recursively from $i + 1$ to $i$, $S > i \ge 0$, choosing a maximum $c_i$ such that

(2) $H_i(c_i) + H_{i+1}(c_{i+1}) + \ldots + H_S(c_S') + U_S - H_S(m_S') - \ldots - H_{i+1}(m_{i+1}) - H_i(m_i) \ge 0.$

This choice is possible with $B - 1 \ge c_i \ge m_i$, for at least $c_i = m_i$ is a valid choice, because of the previous step in the algorithm.


THEOREM 2. For $S \ge 0$, let

$$Q = c_S' B^S + \sum_{i=0}^{S-1} c_i B^i.$$

Then $Q = C$.

Proof. When $i = 0$, the inequality (2) shows that $H(Q) \ge 0$. If $B^k \le A < B^{k+1}$ and $k > S$, then $H(A) \le U_k < 0$, by the definitions of $U_k$ and $S$. If every digit of $Q$ is $B - 1$, it follows that $C = Q = B^{S+1} - 1$.

Otherwise, suppose some digits of $Q$ are less than $B - 1$. Then for each

$$A = a_S' B^S + \sum_{i=0}^{S-1} a_i B^i$$

in the range $Q < A < B^{S+1}$, there must be an index $i$, $S \ge i \ge 0$, such that either $B - 1 \ge a_S' > c_S'$; or $a_S' = c_S'$ and $a_j = c_j$ when $j > i$, but $B - 1 \ge a_i > c_i$.

In the first case, because of the maximum property of $H_i(m_i)$,

$$H(A) \le H_S(a_S') + \sum_{i=0}^{S-1} H_i(m_i) = H_S(a_S') + U_S - H_S(m_S') < 0,$$

where the last strict inequality follows from $a_S' > c_S'$ and the maximum property of $c_S'$ expressed in (1).

In the second case, because of the maximum property of $H_j(m_j)$,

$$H(A) \le H_S(c_S') + \ldots + H_{i+1}(c_{i+1}) + H_i(a_i) + \sum_{j=0}^{i-1} H_j(m_j)$$

$$= H_i(a_i) + H_{i+1}(c_{i+1}) + \ldots + H_S(c_S') + U_S - H_S(m_S') - \ldots - H_i(m_i) < 0,$$

where the last strict inequality follows from $a_i > c_i$ and the maximum property of $c_i$ expressed in (2).

Since we have shown $H(Q) \ge 0$ and $H(A) < 0$ for every $A > Q$, it follows that $Q = C$.

In the following example $B = 4$. The table shows $P(a)$, $H_i(a)$ and $U_i$ with a double underline for $H_i(m_i)$ and, if there is a distinction, a single underline for $H_i(m_i')$. All entries are written in the usual way with base 10.

TABLE I
EXAMPLE 1.

                 B^i      1      4     16     64    256    1024
  a   P(a)       i        0      1      2      3      4       5

  0   100      H_i(0)   100    100    100    100    100     100
  1    50      H_i(1)    49     46     34    -14   -206    -974
  2   200      H_i(2)   198    192    168     72   -312   -1848
  3    10      H_i(3)     7     -2    -38   -182   -758   -3062
               U_i      198    390    558    630    452    -216

With the aid of the later Corollary 5.1, we may see from this table that $S = 4$. Then starting from $M_4 = B^4 + 2B^2 + 2B + 2$, the algorithm of Theorem 2 is the following. Replacing $m_4' = 1$ by $a = 2$ gives $H = 346$, but by $a = 3$ gives $H = -100$, hence $c_4' = 2$. Next, replacing $m_3 = 0$ by $a = 3$ gives $H = 346 - 100 - 182 = 64$, hence $c_3 = 3$. No further replacements are possible: $c_2 = m_2$, $c_1 = m_1$, $c_0 = m_0$. Thus $C = 2B^4 + 3B^3 + 2B^2 + 2B + 2$.
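The digit-by-digit procedure of Theorem 2 can be sketched in code. The version below is an illustration, not the author's program: it assumes the values $P(a)$ are non-negative integers given as a list, and it locates $S$ by evaluating $U_k$ up to the crude bound from the proof of Theorem 1 rather than by the sharper estimates of the next section. The function name `theorem2_C` and its helpers are hypothetical.

```python
def theorem2_C(P, B):
    """Compute C by the greedy digit selection of Theorem 2 (a sketch)."""
    def H(i, a):
        return P[a] - a * B ** i

    def best_digit(i, digits):
        # Largest digit at which H_i attains its maximum over `digits`.
        top = max(H(i, a) for a in digits)
        return max(a for a in digits if H(i, a) == top)

    def digits_at(i, lead):
        return range(1, B) if i == lead else range(B)   # leading digit is non-zero

    def U(k):
        if k < 0:
            return P[0]                                  # U_{-1} = H(0) = P(0)
        return sum(H(i, best_digit(i, digits_at(i, k))) for i in range(k + 1))

    # Crude bound: U_k < 0 once B**k > (k + 1) * max(P), as in Theorem 1.
    K = 0
    while B ** K <= (K + 1) * max(P):
        K += 1
    S = max(k for k in range(-1, K) if U(k) >= 0)
    if S < 0:
        return 0                                         # Lemma: S = -1 gives C = 0

    total, Q = U(S), 0
    for i in range(S, -1, -1):
        digits = digits_at(i, S)
        base = total - H(i, best_digit(i, digits))       # remove the term H_i(m_i)
        c = max(a for a in digits if base + H(i, a) >= 0)  # largest admissible digit
        total = base + H(i, c)
        Q += c * B ** i
    return Q

print(theorem2_C([100, 50, 200, 10], 4))   # Example 1
```

For Example 1 the intermediate totals are 346 after the leading digit and 64 after the next one, which are the values of $H$ quoted in the text.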


4. Growth properties of F(A). In this section we obtain further properties of $H(A) = F(A) - A$ and since our chief concern is what happens to $H(A)$ as $A$ increases, we describe these as growth properties of $F(A)$.

Let $R$ be the maximum value of $(P(a') - P(0))/a'$.

If $R < 1$, define $J = 0$. If $1 \le R$, define $J$ by $B^{J-1} \le R < B^J$.

THEOREM 3. If $i \ge J$, $m_i = 0$. If $i < J$, $m_i = m_i'$.

Proof. Note that $H_i(0) - H_i(a') = P(0) - P(a') + a'B^i > 0$ holds if $B^i > R$, hence for $i \ge J$. But when $i < J$, suppose $R = (P(m') - P(0))/m'$ and note that $H_i(0) - H_i(m') \le 0$.













COROLLARY 3.1. $S \ge J - 1$.

Proof. If $J = 0$, the statement $S \ge -1$ is trivial. If $J > 0$ and $i < J$, then it follows from Theorem 3 that $H_i(m_i') \ge H_i(0) = P(0) \ge 0$. Hence for $k < J$, $U_k \ge 0$, therefore $S \ge J - 1$.

COROLLARY 3.2. There exists an integer $J_1$ such that for $i \ge J_1$, $m_i' = 1$; and if $i < J_1$, then $m_i' > 1$.

Proof. The proof exactly parallels that of Theorem 3, starting with $R_1$ as the maximum value of $(P(a') - P(1))/(a' - 1)$ for all $a' > 1$, and defining $J_1 = 0$, if $R_1 < 1$; but otherwise, defining $J_1$ by $B^{J_1 - 1} \le R_1 < B^{J_1}$.

Example 1 provides an illustration of these results wherein $J = 3$, $J_1 = 4$.

COROLLARY 3.3. The following relations hold:

(3) $U_{i+1} - U_i = H_{i+1}(m_{i+1}')$, $\quad 0 \le i < J$;
(4) $U_{i+1} - U_i = H_{i+1}(m_{i+1}') + P(0) - H_i(m_i')$, $\quad J \le i$.

Proof. In the sums representing $U_{i+1}$ and $U_i$, the terms with index $j \le i - 1$ are the same, hence
$$U_{i+1} - U_i = H_{i+1}(m_{i+1}') + H_i(m_i) - H_i(m_i').$$
When $0 \le i < J$, the second part of Theorem 3 shows $H_i(m_i) = H_i(m_i')$ which establishes (3). When $J \le i$, the first part of Theorem 3 shows $H_i(m_i) = H_i(0) = P(0)$ which establishes (4).

COROLLARY 3.4. Let $J_2$ be the maximum of $J$ and $J_1$. If $i \ge J_2$, then
$$U_{i+1} - U_i = P(0) - B^i(B - 1).$$

Proof. From $i \ge J_2 \ge J$, relation (4) holds. From $i \ge J_2 \ge J_1$, Corollary 3.2 shows $m_{i+1}' = m_i' = 1$, hence
$$H_{i+1}(m_{i+1}') - H_i(m_i') = P(1) - B^{i+1} - (P(1) - B^i),$$
thus (4) reduces to the stated form.

THEOREM 4. For $i \ge 0$, $B^i(B - 1) \le H_i(m_i') - H_{i+1}(m_{i+1}') \le B^i(B - 1)^2$.

Proof. From the maximum property of $H_i(m_i')$ it follows that
$$H_i(m_i') - H_{i+1}(m_{i+1}') \ge H_i(m_{i+1}') - H_{i+1}(m_{i+1}') = m_{i+1}' B^i(B - 1) \ge B^i(B - 1).$$
From the maximum property of $H_{i+1}(m_{i+1}')$ it follows that
$$H_i(m_i') - H_{i+1}(m_{i+1}') \le H_i(m_i') - H_{i+1}(m_i') = m_i' B^i(B - 1) \le B^i(B - 1)^2.$$


, 


COROLLARY 4.1. $m_{i+1}' \le m_i'$.

Proof. In the displayed steps of the proof of Theorem 4, note that

$$m_{i+1}' B^i(B - 1) \le H_i(m_i') - H_{i+1}(m_{i+1}') \le m_i' B^i(B - 1).$$

Define $L$ to be the minimum integer such that $U_{L+1} < U_L$ and such that if $J > 0$, then $L \ge J - 1$; but if $J = 0$, then $L \ge J$.

We appeal to Corollary 3.4, with $i$ sufficiently large, to show that $L$ must exist. (The existence of $L$ may be shown also by the existence of $S$ and by Corollary 3.1, except for the case $J = 0$ and $S = -1$.)


THEOREM 5. If $i \ge L$, then $U_{i+1} < U_i$.

Proof. The proof is by induction on $i$ with the case $L$ serving as the base for the induction. When $i \ge J + 1$, it follows from (4) and Theorem 4 that

$$U_{i+1} - U_i = H_{i+1}(m_{i+1}') + P(0) - H_i(m_i') \le P(0) - B^i(B - 1)$$
$$< P(0) - B^{i-1}(B - 1)^2 \le P(0) + H_i(m_i') - H_{i-1}(m_{i-1}') = U_i - U_{i-1}.$$

When $J = 0$ this completes the proof, since $i - 1 \ge L \ge J$ implies $i \ge J + 1$. When $J > 0$ the above argument is valid except for the one possibility $i - 1 = L = J - 1$. But then using $P(0) = H_{J-1}(0)$, the second part of Theorem 3, and (3), we may modify the last displayed line to read

$$H_{J-1}(0) + H_J(m_J') - H_{J-1}(m_{J-1}') \le H_J(m_J') = U_J - U_{J-1},$$

which completes the proof.

COROLLARY 5.1. If $E \ge L$ and if $U_E \ge 0$ but $U_{E+1} < 0$, then $E = S$.

Proof. Theorem 5 shows $U_k \le U_{E+1} < 0$ for every $k > E$. Hence $E = S$.

As an application of this corollary note in Example 1 that $L = 3$, $U_4 > 0$, $U_5 < 0$, consequently $S = 4$.


COROLLARY 5.2. If $J > 0$ and $i < J - 1$, then $U_{i+1} \ge U_i$.

Proof. If $U_0 < U_{-1}$, then $P(m') - m' < P(0)$ implies $R < 1$ and $J = 0$. So
the hypothesis $J > 0$ implies $U_0 \ge U_{-1}$. Since $i < J - 1$, $i + 1 \le J - 1$, and
$R = (P(m') - P(0))/m' \ge B^{J-1} \ge B^{i+1}$ which implies $P(m') - m'B^{i+1} \ge P(0)$.
Then for $J - 1 > i \ge 0$, relation (3) holds, so that

$U_{i+1} - U_i = H_{i+1}(m'_{i+1}) \ge H_{i+1}(m') = P(m') - m'B^{i+1} \ge P(0) \ge 0$.

Corollary 5.2 indicates that when $J > 0$, the condition $L \ge J - 1$ is
necessary if we are to have $U_{L+1} < U_L$. Thus the search for $S$, initiated in
Corollary 3.1 and made explicit in Corollary 5.1, should begin at this point
$L \ge J - 1$.

However, when $J = 0$, the added condition $L \ge J$ plays a different role.
For $J = 0$ implies $R < 1$, hence $U_0 = P(m'_0) - m'_0 < P(0) = U_{-1}$, but this
does not imply $U_1 < U_0$ as the following example shows.











TABLE II

EXAMPLE 2.

                    B^i      1      4     16     64     256    1024
 a   P(a)            i       0      1      2      3       4       5

 0   100   H_i(0)          100    100    100    100     100     100
 1    90   H_i(1)           89     86     74     26    -166    -934
 2    80   H_i(2)           78     72     48    -48    -432   -1968
 3    70   H_i(3)           67     58     22   -122    -698   -3002

           U_i              89    186    274    326


In Example 2, $J = 0$ and $U_0 = 89 < 100 = U_{-1}$. However, $L = 3$ and
$S = 4$. Starting from $M_4 = B^4$ we find by the algorithm of Theorem 2 that


$C = B^4 + 3B^3 + 1$.
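
The entries of Table II can be recomputed mechanically. The sketch below is only an illustration: it assumes $H_i(a) = P(a) - aB^i$ and reads $U_i$ as $H_i(m'_i)$ plus the sum of the maxima $H_j(m_j)$ for $j < i$, which is how the tabulated values behave; the function names and the search limit are choices made here, not part of the paper.

```python
def H(i, a, P, B):
    """H_i(a) = P(a) - a * B**i."""
    return P[a] - a * B**i


def U_values(P, B, top):
    """U_0..U_top, assuming U_i = max over nonzero a of H_i(a),
    plus the sum over j < i of the max over all a of H_j(a)."""
    out = []
    carried = 0
    for i in range(top + 1):
        best_nonzero = max(H(i, a, P, B) for a in range(1, B))
        out.append(best_nonzero + carried)
        carried += max(H(i, a, P, B) for a in range(B))
    return out


def F(n, P, B):
    """Sum of P over the base-B digits of n (F(0) = P(0))."""
    total = 0
    while True:
        n, d = divmod(n, B)
        total += P[d]
        if n == 0:
            return total


def largest_C(P, B, limit):
    """Largest A below limit with F(A) >= A (brute force)."""
    return max(a for a in range(limit) if F(a, P, B) >= a)


P_EXAMPLE_2 = [100, 90, 80, 70]
# A six-digit number in base 4 is at least 1024 while F is at most 590,
# so searching below 4**6 is more than enough here.
C_EXAMPLE_2 = largest_C(P_EXAMPLE_2, 4, 4**6)
```

With these definitions the brute-force value agrees with $B^4 + 3B^3 + 1 = 449$, and the computed $U_i$ change sign between $i = 4$ and $i = 5$, consistent with $S = 4$.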


5. The case $P(a) = a^t$. If $P(a) = a^t$ where $t$ is a fixed positive integer,
there are two trivial cases. If $t = 1$, it is obvious that $C = B - 1$ for every
$B$. If $B = 2$, it is obvious that $C = 1$ for every $t$.


THEOREM 6. If $P(a) = a^t$, $t > 1$, $B > 2$, then $0 < J \le S \le t$ and $S$ may
be determined by: $B^S \le J(B - 1)^t - (B^J - 2) < B^{S+1}$.


Proof. Since $P(0) = 0$, $R = (B - 1)^{t-1}$, so that $J$ is determined by
$B^{J-1} \le (B - 1)^{t-1} < B^J$. The condition $t > 1$ implies $J > 0$. Moreover,
$(B - 1)^{t-1} < B^{t-1}$, hence $J \le t - 1$.

Since $t > 1$, $H_i(x) = x^t - xB^i$ is concave upward for $x > 0$. Hence $m'_i$ is
either 1 or $B - 1$. Note that

$H_i(1) - H_i(B - 1) = 1 + (B - 2)B^i - (B - 1)^t$.

When $i \le J - 1$,
$(B - 2)B^i < (B - 1)B^{J-1} \le (B - 1)^t$,
so that $m'_i = B - 1$. When $i \ge J + 1$,

$(B - 2)B^i \ge (B - 2)B^{J+1} > (B - 2)(B - 1)^t \ge (B - 1)^t$,

so $m'_i = 1$. From Theorem 3, it follows that $m_i = m'_i = B - 1$ for $i \le J - 1$,
and $m_i = 0$ for $i \ge J$. The only undecided case is $m'_J$ which is either 1 or
$B - 1$.

From the preceding results

(5) $U_{J-1} = \sum_{i=0}^{J-1} H_i(B - 1) = \sum_{i=0}^{J-1} ((B - 1)^t - (B - 1)B^i) = J(B - 1)^t - (B^J - 1)$;
$U_J = H_J(m'_J) + U_{J-1} \ge H_J(1) + U_{J-1}$;
$U_i = H_i(1) + U_{J-1}$, for $i > J$.

Using (5) and $(B - 1)^{t-1} \ge B^{J-1} + 1$, we may show $U_J \ge 0$ as follows:

$U_J \ge H_J(1) + U_{J-1} = J(B - 1)^t - 2(B^J - 1)$
$\ge J(B - 1)(B^{J-1} + 1) - 2(B^J - 1) = B^{J-1}((J - 2)B - J) + J(B - 1) + 2$.

If $J = 1$ or 2 the last expression is 0. If $J \ge 3$, the last expression is positive,
for $B > 2$ implies $(J - 2)B \ge J$. Hence $U_J \ge 0$, so $S \ge J$ (a bit more than
Corollary 3.1).

If $i > t$, then $B^i \ge B^{t+1}$; and also from $J \le t - 1$, we have $i > J + 1$. We
combine these observations with (5) to see that if $i > t$, then

$U_i = 1 - B^i + J(B - 1)^t - (B^J - 1)$
$< (t - 1)(B - 1)^t - B^{t+1} < (B - 1)^{t+1} + (t + 1)(B - 1)^t - B^{t+1}$
$< ((B - 1)^{t+1} + (t + 1)(B - 1)^t + \cdots + 1) - B^{t+1}$
$= (B - 1 + 1)^{t+1} - B^{t+1} = 0$.

Since $U_i < 0$ for $i > t$, it follows that $S \le t$.

In the proof that $S \ge J$ we showed that $H_J(1) + U_{J-1} \ge 0$ which implies
$B^J \le 1 + U_{J-1}$. From (5) we have $U_i = 1 + U_{J-1} - B^i$ when $i > J$, hence
we see that $S$ (with $U_S \ge 0$ and $U_k < 0$ for all $k > S$) is determined by

$B^S \le 1 + U_{J-1} < B^{S+1}$.

This result together with (5) completes the proof of Theorem 6.
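
The two inequalities of Theorem 6 translate directly into exact integer arithmetic. The following is a minimal sketch, assuming only the statement of the theorem; the function name `theorem6_J_S` is hypothetical.

```python
def theorem6_J_S(B: int, t: int) -> tuple[int, int]:
    """Return (J, S) for P(a) = a**t in base B, following Theorem 6.

    J satisfies B**(J-1) <= (B-1)**(t-1) < B**J, and
    S satisfies B**S <= J*(B-1)**t - (B**J - 2) < B**(S+1).
    """
    if t <= 1 or B <= 2:
        raise ValueError("Theorem 6 assumes t > 1 and B > 2")
    R = (B - 1) ** (t - 1)
    J = 1
    while B**J <= R:          # smallest J with R < B**J
        J += 1
    bound = J * (B - 1) ** t - (B**J - 2)
    S = 0
    while B ** (S + 1) <= bound:  # largest S with B**S <= bound
        S += 1
    return J, S
```

Python integers are unbounded, so the case $B = 10$, $t = 100$ needs no logarithm table.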


In general, to find C we must next apply the algorithm of Theorem 2. 
However, in many cases we can say considerably more, as the following 
theorem indicates. 


THEOREM 7. If $P(a) = a^t$, $t > 1$, $B > 2$, then $C < (t - 1)B^t$. If
$B > T = (1 - (1 - t^{-1})^{1/t})^{-1}$ (which includes all $B \ge t^2$) then


$C = (t - 1)B^t - 1$.


Proof. From Theorem 6, $S \le t$, hence $C < B^{t+1}$. Suppose that $C < (t - 1)B^t$
is false. Then $B^{t+1} > C \ge (t - 1)B^t$ implies $B \ge t$ and

$C = c_tB^t + \sum_{i=0}^{t-1} c_iB^i$

with $B - 1 \ge c_t \ge t - 1$. But then it follows that

$C \ge c_t(B - 1 + 1)^t + c_{t-1}B^{t-1}$
$= c_t((B - 1)^t + t(B - 1)^{t-1} + \cdots + 1) + c_{t-1}B^{t-1}$
$> c_t(B - 1)^t + c_tt(B - 1)^{t-1} + c_{t-1}B^{t-1}$
$> (t - 1)(B - 1)^t + (c_t)^t + (c_{t-1})^t$
$\ge (c_t)^t + \sum_{i=0}^{t-1} (c_i)^t = F(C)$.

The inequality $C > F(C)$ is a contradiction of one of the defining properties
of $C$. Therefore $C < (t - 1)B^t$ is true, as stated in the first part of Theorem 7.

It is natural to ask for $B \ge t$ whether $Q = (t - 1)B^t - 1$ will serve as $C$.
Since $F(Q) = (t - 2)^t + t(B - 1)^t$, the inequality $F(Q) \ge Q$ will hold if
$t(B - 1)^t \ge (t - 1)B^t$. This is readily brought to the form

$B > T = (1 - (1 - t^{-1})^{1/t})^{-1}$.

Since $(1 - t^{-2})^t > 1 - t^{-1} > (1 - t^{-1})^t$, it follows that $t^2 > T > t$. These
observations complete the proof of Theorem 7.
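
For small bases the closed form of Theorem 7 can be compared with a direct search. The sketch below assumes only the definitions in the text ($F$ is the sum of $t$-th powers of the base-$B$ digits, and $C$ is the largest $A$ with $F(A) \ge A$); the search limit $B^{t+1}$ comes from $S \le t$ in Theorem 6, and the helper names are illustrative.

```python
def digit_power_sum(n: int, base: int, t: int) -> int:
    """F(n) for P(a) = a**t: sum of t-th powers of the base-`base` digits."""
    total = 0
    while True:
        n, d = divmod(n, base)
        total += d**t
        if n == 0:
            return total


def largest_C(base: int, t: int) -> int:
    """Largest A with F(A) >= A, scanning down from base**(t+1) - 1."""
    for a in range(base ** (t + 1) - 1, -1, -1):
        if digit_power_sum(a, base, t) >= a:
            return a
    raise AssertionError("unreachable: F(0) = 0 >= 0")


def threshold_T(t: int) -> float:
    """T = (1 - (1 - 1/t)**(1/t))**(-1) from Theorem 7 (floating point)."""
    return 1.0 / (1.0 - (1.0 - 1.0 / t) ** (1.0 / t))
```

For example, $B = 10$, $t = 3$ lies above the threshold $T \approx 7.91$, and the search returns $2 \cdot 10^3 - 1$.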


In the remaining cases the method of Theorem 2 is available for finding C. 
At least one general observation can be made about the result. 


THEOREM 8. For $P(a) = a^t$, $t > 1$, $B > 2$, $C$ has the property that $c_i = B - 1$
for $i < J$; and either $c_i = B - 1$ or $c_i \le t - 2$ for $J \le i \le S$.


Proof. Recall from the proof of Theorem 6 that $m_i = m'_i = B - 1$ for
$i \le J - 1$. Since $c_i \ge m_i$ it follows that $c_i = B - 1$ for $i \le J - 1$.

The rest of the theorem is trivial if $B \le t$, and is known from Theorem 7
if $B > T$. In what follows assume $B > t$.

If $S = t$, it follows from $C < (t - 1)B^t$, that $c_t \le t - 2$. Since $S \le t$, it
remains to discuss $c_i$ for the cases $J \le i \le S$ where $i < t$.

Since $1 < t < B$, note that

$\frac{(B - 1)^t - (t - 1)^t}{(B - 1) - (t - 1)} = \sum_{j=0}^{t-1} (B - 1)^j(t - 1)^{t-1-j} \ge \sum_{j=0}^{t-1} (B - 1)^j\binom{t-1}{j} = (B - 1 + 1)^{t-1} = B^{t-1} \ge B^i$.

Hence $H_i(B - 1) = (B - 1)^t - (B - 1)B^i \ge (t - 1)^t - (t - 1)B^i =
H_i(t - 1)$. Because of the concave upward property of $H_i(x)$ the inequality
$H_i(B - 1) \ge H_i(t - 1)$ indicates that the choice of $c_i$ in the range
$t - 1 \le c_i < B - 1$ would be a contradiction of the requirement in the
algorithm of Theorem 2 that $c_i$ be maximal satisfying (1) or (2). Consequently
$c_i$ must be limited to the values stated in the theorem.

The following tables illustrate Theorems 6, 7, 8 by showing $C$ for $P(a) = a^t$
for all $B \ge 3$ when $t = 2, 3, 4, 5$.


TABLE III   t = 2

  B            C
  B ≥ 3        B^2 - 1

TABLE IV   t = 3

  B            C
  3            2B^2 - 1
  4, 5         B^3 - 1
  6, 7         B^3 + B^2 - 1
  B ≥ 8        2B^3 - 1

TABLE V   t = 4

  B            C
  3            B^3 - 1
  4, 5         B^4 - 1
  6            B^4 + B^3 - 1
  7 to 11      2B^4 - 1
  12, 13       2B^4 + B^3 - 1
  14           2B^4 + 3B^3 - 1
  B ≥ 15       3B^4 - 1

TABLE VI   t = 5

  B            C
  3            B^4 - 1
  4            B^5 - 1
  5            B^5 + B^4 - 1
  6, 7, 8      2B^5 - 1
  9            2B^5 + B^4 - 1
  10 to 19     3B^5 - 1
  20           3B^5 + B^4 - 1
  21           3B^5 + 2B^4 - 1
  22           3B^5 + 3B^4 - 1
  B ≥ 23       4B^5 - 1


The effectiveness of the algorithm for finding C may be illustrated by an 
example such as B = 10, ¢ = 100. The necessary comparisons are in this case 
successfully made with a table of logarithms. 


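     Test                                                                           Decision
(1)  $10^{J-1} \le 9^{99} < 10^J$                                                    $J = 95$
(2)  $10^S \le U_{94} + 1 = 95 \cdot 9^{100} - 10^{95} + 2 < 10^{S+1}$               $S = 97$
     (Remember from (5) that $U_{97} = H_{97}(1) + U_{94}$.)
(3)  $c^{100} - c \cdot 10^{97} + U_{94} \ge 0$                                      $c_{97} = 2$
(4)  $c^{100} - c \cdot 10^{96} + 2^{100} - 2 \cdot 10^{97} + U_{94} \ge 0$          $c_{96} = 5$
(5)  $c^{100} - c \cdot 10^{95} + 5^{100} - 5 \cdot 10^{96} + 2^{100} - 2 \cdot 10^{97} + U_{94} \ge 0$   $c_{95} = 1$


Theorem 8 guarantees $c_i = 9$ for $0 \le i < J$, so the algorithm closes, and


$C = 2 \cdot B^{97} + 5 \cdot B^{96} + B^{95} + (B^{95} - 1)$.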
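
The digit-by-digit tests in this example can be replayed with exact integers. The sketch below is an assumption about how the algorithm of Theorem 2 operates, inferred from tests (3) to (5): starting from position $S$, each digit is taken as large as possible while the running total of $H_i(c_i)$, plus the best possible contribution of the lower positions, stays non-negative. The function name is hypothetical.

```python
def greedy_digits(B: int, t: int, S: int) -> list[int]:
    """Digits c_S, ..., c_0 of C for P(a) = a**t, most significant first.

    Assumes S is the correct top position (for example from Theorem 6).
    """
    def H(i: int, a: int) -> int:
        return a**t - a * B**i

    # best_below[i] = sum over j < i of max_a H_j(a)
    best_below = [0] * (S + 1)
    running = 0
    for i in range(S + 1):
        best_below[i] = running
        running += max(H(i, a) for a in range(B))

    digits = []
    acc = 0  # sum of H_i(c_i) over the digits fixed so far
    for i in range(S, -1, -1):
        lowest = 1 if i == S else 0  # the leading digit must be nonzero
        feasible = [a for a in range(lowest, B)
                    if acc + H(i, a) + best_below[i] >= 0]
        c = max(feasible)
        digits.append(c)
        acc += H(i, c)
    return digits
```

For $B = 10$, $t = 100$, $S = 97$ this yields the leading digits 2, 5, 1 followed by ninety-five 9's, which is the value of $C$ displayed above.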


6. Orbits of F-related integers. Return now to the general function 
P(a) requiring only that P(a) is a non-negative integer. This modest restriction 
not only allows the number C to be determined as in Theorem 2, but also 
allows the function F(A) to be iterated. 

Define $F^{(0)}(A) = A$ and $F^{(k+1)}(A) = F(F^{(k)}(A))$. Integers $X$ and $Y$ are
said to be F-related if and only if there exist non-negative integers $k$ and $m$
such that $F^{(k)}(X) = F^{(m)}(Y)$. Being F-related is an equivalence relation
dividing all non-negative integers into $N$ disjoint sets of F-related integers.
Following Isaacs (1) call each such set an orbit and denote the orbit
containing $A$ by $\{A\}$.


THEOREM 9. For $F(A)$ the number $N$ is finite.


Proof. The existence of $C$ implies that each orbit $\{A\}$ contains at least one
integer $K$ with $K \le C$, for otherwise the sequence $F^{(n)}(A)$ for $n = 0, 1, 2, \ldots$
(all of whose members belong to $\{A\}$) would be an infinite decreasing sequence
of non-negative integers. The existence of such a $K$ for each orbit $\{A\}$ shows
that $1 \le N \le C + 1$.


COROLLARY 9.1. At least one orbit must be infinite.


An improved estimate of the value of $N$ may be obtained by noting that
the value of $F(A)$ does not depend on the order of the digits of $A$. For if $A_1$
is obtained from $A$ merely by permuting the digits (but keeping $a'_1 > 0$, of
course), then $F(A_1) = F(A)$. Consequently many numbers less than $C$ are
apt to be F-related.

Let $C^*$ be the number of integers $A$, $1 \le A \le C$, which can be written

$A = a'B^k + \sum_{i=0}^{k-1} a_iB^i$, $B - 1 \ge a' \ge a_{k-1} \ge \cdots \ge a_1 \ge a_0 \ge 0$.

Then an improved estimate for $N$ is given by $1 \le N \le C^* + 1$.
From $C < B^{S+1}$ and properties of the binomial coefficients it follows that

$C^* \le \binom{B+S+1}{S+1} - (S + 2)$.
The work of Isaacs shows for the iteration of a much more general function
$G$, that each orbit of G-related numbers has at most one "cycle" and various
incoming "branches." The word "cycle" has the usual meaning: namely, for
$F(A)$ it will mean the existence of a period number $p$ (minimal and positive)
and an initial point $q$ such that


$F^{(i+p)}(A) = F^{(i)}(A)$ for all $i \ge q$.


If $F^{(m)}(X) = Y$, $m \ge 1$, then $X$ is called an "antecedent" of $Y$. If $m = 1$,
$X$ is an "immediate antecedent" of $Y$. If $X \ne Y$, $X$ is a "proper antecedent"
of $Y$. If $F(X) = U$ is in the cycle part of $\{A\}$, but $X$ itself is not in the cycle,
then $X$ and all its antecedents constitute a "branch" of $\{A\}$.

THEOREM 10. For $F(A)$ each orbit $\{A\}$ has a unique cycle.


Proof. If the orbit $\{A\}$ is non-cyclic, then for all $n$ sufficiently large
$F^{(n)}(A) > C$; however, for such $n$, $F^{(n+1)}(A) < F^{(n)}(A)$ and a contradiction
is reached, for we cannot have an infinite decreasing sequence of integers
$> C$. Thus each orbit $\{A\}$ must have a finite cycle.

To show that this cycle depends on $\{A\}$ and not on the representative $A$,
we reproduce Isaacs' proof. Suppose $U$ and $U'$ are both in $\{A\}$ and that each
is a member of some cycle of $\{A\}$. The first hypothesis implies the existence
of $k$ and $m$ so that $F^{(k)}(U) = F^{(m)}(U') = U''$. The second hypothesis now
shows that $U''$ is in the cycle containing $U$ and also in the cycle containing
$U'$. In other words, $\{A\}$ has only one cycle.

























COROLLARY 10.1. Let $W$ be the maximum value of $F(A)$ for $A \le C$. Then
the period $p$ of the cycle of $\{A\}$ is bounded by $1 \le p \le W + 1$.


Proof. In the proof of Theorem 9 we showed that $\{A\}$ contains at least one
$K$ with $K \le C$. Then $F^{(m)}(K) \le W$ for all $m \ge 0$. For either $C < F^{(m)}(K) \le W$,
whence $F^{(m+1)}(K) < F^{(m)}(K)$ by the definition of $C$, thus $F^{(m+1)}(K) < W$; or
$0 \le F^{(m)}(K) \le C$, whence $F^{(m+1)}(K) \le W$ by the definition of $W$. Not only
is the existence of a cycle of $\{A\}$ newly evident, but also the maximum number
of elements in the cycle is the complete set $0 \le X \le W$, hence $1 \le p \le W + 1$.


COROLLARY 10.2. Each element $U$ of the cycle part of $\{A\}$ has the property
$U \le W$ and at least one member $U$ satisfies $U \le C$.


A simple example in which the maximums of both $N$ and $p$ are attained is
given by $B = 2$, $P(0) = 1$, $P(1) = 0$, wherein $C = 0$, $W = 1$, and there is
just one orbit: $N = 1 = C + 1$, with $p = 2 = W + 1$.

There seem to be few additional general statements to be made about the 
orbits, cycles, and branches, for by varying P(a) properly, we may construct 
bizarre situations which contradict proposed generalizations. 


REMARK 1. Not every orbit need be infinite. For if $P(q) = q$, but $P(a) > q$
when $a \ne q$, then $\{q\}$ contains only $q$.


REMARK 2. If $P(a) = 1$ for some $a \ne 0$, then every $Y > 1$ has a proper
antecedent. For $F(A) = Y$ has a solution

$A = a\sum_{i=0}^{Y-1} B^i$

and $A > Y$.


REMARK 3. If $P(a) = 0$ for some $a$, and if $A \ne 0$, then $F(A)$ has infinitely
many immediate antecedents. For since $P(a) = 0$,

$A_m = AB^m + a\sum_{i=0}^{m-1} B^i$

has $F(A_m) = F(A)$, for $m = 1, 2, \ldots$. And since $A \ne 0$, the $A_m$ are distinct
(even if $a = 0$).


REMARK 4. If $P(a) > 0$ for all $a$, then each $Y$ has at most a finite number
of immediate antecedents. For note that if $x_a$ denotes the number of digits
of $A$ which are equal to $a$, then $F(A)$ may be written

$F(A) = \sum_{a=0}^{B-1} x_aP(a)$.

Then the assumption $P(a) > 0$ for every $a$ and the restrictions $x_a \ge 0$ mean
that $F(A) = Y$ is a linear Diophantine form problem with at most a finite
number of solutions: $x_0, x_1, \ldots, x_{B-1}$. (Of course, there may be no solution.)
Corresponding to each such solution set there are only a finite number of
integers $A$ resulting from permissible permutations of the sets of digits.
(Permissible means at least one $x_a > 0$ and $a' > 0$ where


EXAMPLE 3. Suppose $B = 10$ and $P(0) = P(2) = P(4) = 18$, $P(6) = 8$,
$P(8) = 6$, $P(1) = P(3) = P(5) = 5$, $P(7) = 9$, $P(9) = 7$. It is easy to find
$J = 0$, $S = 1$, $M = 20$, $C = 27$, $W = 36$. Then $\{1, 3, 5\}$ is a finite orbit
with $p = 1$; and $\{6, 8\}$ and $\{7, 9\}$ are finite orbits each with $p = 2$. All other
integers belong to either $\{23\}$, $\{26\}$, or $\{27\}$, all of which are infinite orbits,
each with $p = 1$. Hence $N = 6$. These results follow from Corollary 10.2 and
Remark 4.
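
The orbit structure of Example 3 can be checked by iterating $F$ from every starting value up to $C$, since each orbit contains some $K \le C$ (Theorem 9) and has a unique cycle (Theorem 10). The following is a small sketch under those two facts; the helper names and the search limit of 1000 (a three-digit number is at least 100 while $F$ is at most 54) are choices made for illustration.

```python
def make_F(P, base):
    """Build F(n) = sum of P over the base-`base` digits of n."""
    def F(n):
        total = 0
        while True:
            n, d = divmod(n, base)
            total += P[d]
            if n == 0:
                return total
    return F


def cycle_of(F, start):
    """Iterate F from `start` and return the cycle reached, as a frozenset."""
    first_seen = {}
    x, step = start, 0
    while x not in first_seen:
        first_seen[x] = step
        step += 1
        x = F(x)
    entry = first_seen[x]
    return frozenset(v for v, s in first_seen.items() if s >= entry)


P3 = {0: 18, 1: 5, 2: 18, 3: 5, 4: 18, 5: 5, 6: 8, 7: 9, 8: 6, 9: 7}
F3 = make_F(P3, 10)
C3 = max(a for a in range(1000) if F3(a) >= a)
W3 = max(F3(a) for a in range(C3 + 1))
CYCLES3 = {cycle_of(F3, a) for a in range(C3 + 1)}
```

The number of distinct cycles found this way equals the number of orbits $N$.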


EXAMPLE 4. Suppose $P(a) = q$ for every $a$. If $0 \le q < B/2$, then $N = 1$
with $p = 1$ and $F(C) = C = q$. If $B^S/S \le q < B^{S+1}/(S + 2)$, $S \ge 1$, then
$N = 1$ with $p = 1$ and $F(C) = C = (S + 1)q$. If $B^S/(S + 1) \le q < B^S/S$,
$S \ge 1$, then $N = 2$ and both orbits have $p = 1$: one (infinite) contains
$F(C) = C = (S + 1)q$, the other (finite) contains $F(U) = U = Sq$.


8. Orbits for $P(a) = a^t$. When the previous discussion is applied to the
case $P(a) = a^t$, a few additional comments may be made.

Remark 1 applies with $q = 0$. Hence $\{0\}$ contains only 0.

Remark 2 applies with $a = 1$. Hence if $Y > 1$, $Y$ has a proper antecedent.
Because $P(0) = 0$, $F(B^k) = 1$, so $Y = 1$ also has a proper antecedent. Note
that the orbit $\{1\}$ has $p = 1$.

Remark 3 applies. Hence each $A \ne 0$ has infinitely many immediate
antecedents.

Let $N_i$ indicate the number of orbits of F-related integers with period $i$.
Then $N = \sum N_i$.

For $t = 1$ and any $B$, $N = N_1 = B$. For $C = B - 1$ implies $N \le C + 1 = B$
and $P(a) = a$ shows each $\{a\}$ has $p = 1$. Note that the corresponding
$F(A) = \sum a_i$ is the function met in arithmetic in the process called
"casting-out $(B - 1)$'s" and has the useful property $F(A) \equiv A \bmod (B - 1)$.

For $B = 2$ and any $t$, $N = N_1 = 2$. For $C = 1$ shows $N \le 2$ and each of
$\{0\}$ and $\{1\}$ has $p = 1$. By the same argument $N_1 \ge 2$ for every $t$ and every $B$.

If $t = 2$ and $B$ is odd, then $N_1$ is even and $N_1 \ge 4$. From Section 5, when
$t = 2$, $C = B^2 - 1$, and hence by Corollary 10.2 each 1-cycle must contain
either $U = b$ or $U = aB + b$. If $F(b) = b^2 = b$, then $b = 0$ or 1, the cases
noted in the previous paragraph. If $F(aB + b) = a^2 + b^2 = aB + b$, then it
follows that

$F((B - a)B + b) = (B - a)^2 + b^2 = B^2 - 2aB + (aB + b) = (B - a)B + b$.

Also $B - a \ne a$, because $B$ is odd. Hence 1-cycles of this type occur in pairs,
thus $N_1$ is even. Furthermore, at least one choice of $a$ and $b$ is always
available: $a = b = (B + 1)/2$. Hence $N_1 \ge 4$.
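
The pairing argument for $t = 2$ and odd $B$ is easy to observe numerically. The sketch below lists the fixed points of the digit-square sum below $B^2$, which suffices because $C = B^2 - 1$ when $t = 2$; the function names are illustrative only.

```python
def digit_square_sum(n: int, base: int) -> int:
    """F(n) for P(a) = a**2 in the given base."""
    total = 0
    while True:
        n, d = divmod(n, base)
        total += d * d
        if n == 0:
            return total


def fixed_points(base: int) -> list[int]:
    """All A with F(A) = A; every such A is at most C = base**2 - 1."""
    return [a for a in range(base * base) if digit_square_sum(a, base) == a]


def partner(a_value: int, base: int) -> int:
    """Map a two-digit fixed point aB + b to (B - a)B + b."""
    a, b = divmod(a_value, base)
    return (base - a) * base + b
```

For $B = 3$ the fixed points are 0, 1, 5, 8, and for $B = 5$ they are 0, 1, 13, 18; in each case the two-digit ones form a pair exchanged by `partner`.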

Perhaps the best way to show the teasing irregularity of the orbit and cycle 
numbers of F-related integers when P(a) = a‘ is to append the following 
brief tables. 


TABLE VII TABLE VIII 
t=2 t=3 
B NiN: N; Others N B N,N: N; Others N 
3 4 1 5 3 3 N,=1 4 
42 2 4 10 10 
5 4 1 5 5 4 1 5 
6 2 N,=1 3 6 5 N; 1 6 
7 6 N,=2 S 7 8 42N,=N,=1 16 
8 42 1 7 8 7 N,=1 s 
8 2 § 6 yg 9 2 N, = Ny = 1 13 
10 2 N, = 1 3 10 6 2 32 10 
ll 4 2 6 - 
12 42 1Ny=1 8 
13 8 3 ll 
142 1 N, = 1 4 
5 413N,2N=12N; 11 
16 2 Ne = 1 3 


Added in proof: 

During 1959-60, as part of an NSF Undergraduate Research Project, 
Joseph C. Ferrar made use of the Michigan State University MISTIC to 
check and extend Tables VII and VIII. Thanks to this work several correc- 
tions have been made in Table VII. The extended tables for t = 2 show B 
from 17 to 32 and for t = 3 show B from 11 to 16. 


Space allows explanation of just one of these entries. 


When $B = 10$ and $t = 3$, then $C = 1999$. From the discussion following
Corollary 9.1, there are

$\binom{12}{3} = 220$

numbers from 0 to 999 and

$\binom{11}{3} = 165$

numbers from 1111 to 1999 which need to be considered. The results are as
follows:

$N_1 = 6$: the 1-cycles being 0; 1; 153; 370; 371; 407;
$N_2 = 2$: the 2-cycles being 136, 244, and 919, 1459;
$N_3 = 2$: the 3-cycles being 55, 250, 133, and 160, 217, 352.


Then by Corollary 10.2 each non-negative integer is a member of one and
only one of these $N = 10$ orbits.
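
The census for $B = 10$, $t = 3$ can be reproduced by following every start value from 0 to $C = 1999$ to its cycle. This is a brute-force sketch rather than the digit-multiset count used in the text; the names are illustrative.

```python
from collections import Counter


def digit_cube_sum(n: int) -> int:
    """Sum of the cubes of the decimal digits of n."""
    total = 0
    while True:
        n, d = divmod(n, 10)
        total += d**3
        if n == 0:
            return total


def cycle_from(start: int) -> frozenset:
    """Cycle eventually reached by iterating digit_cube_sum from start."""
    first_seen = {}
    x, step = start, 0
    while x not in first_seen:
        first_seen[x] = step
        step += 1
        x = digit_cube_sum(x)
    entry = first_seen[x]
    return frozenset(v for v, s in first_seen.items() if s >= entry)


CUBE_CYCLES = {cycle_from(a) for a in range(2000)}
CYCLES_BY_PERIOD = Counter(len(c) for c in CUBE_CYCLES)
```

Since every orbit contains an integer not exceeding $C$, the set `CUBE_CYCLES` contains one cycle per orbit.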


9. Products of functions of digits. Use the previous notation for $a$, $a'$
and $A$ and suppose $P(a)$ is a rational integer $P(a) \ge 0$. Define

$G(a) = P(a)$, $G(A) = P(a')\prod_{i=0}^{k-1} P(a_i)$.


The question suggested by Theorem 1 (for « = 1) is whether there exists 
an integer $D$ for which $G(D) \ge D$ and $G(A) < A$ for every $A > D$.

Let M indicate the maximum value of P(a) and let M’ indicate the maximum 
value of P(a’). 


Case 1. If $M' \ge B$, then $D$ does not exist.
Proof. If $P(b') = M'$, then

$A = b'\sum_{i=0}^{k} B^i$

has $G(A) = (M')^{k+1} \ge B^{k+1} > A$ for every $k$.

Case 2. If $M' = 0$, then $D = 0$.

Proof. If $A > 0$, $P(a') = 0$, so $G(A) = 0 < A$. And $G(0) = P(0) \ge 0$.

Case 3. If $0 < M' < B$ and $M \ge B + 1$, then $D$ does not exist.

Proof. The hypotheses imply $P(0) = M$. Then $A = b'B^k$ has $G(A) = M'M^k
\ge (B + 1)^k > B^{k+1} > A$, for all $k$ sufficiently large.


Case 4. If $M < B$, then $D$ exists.


Proof. Note $M' \le M$. If $B^k \le A < B^{k+1}$ and if $k \ge (B - 1)(B - 2) = k_1$
then $G(A) \le M'M^k \le (B - 1)^{k+1} < B^k \le A$. For from the assumption
$k \ge k_1$, it follows that $B - 1 \le 1 + k/(B - 1) < (1 + 1/(B - 1))^k =
(B/(B - 1))^k$. Since $G(0) \ge 0$, $D$ exists and is in the range $0 \le D < B^{k_1}$.


Case 5. If $M' < B$, if $M = B$, and if $P(a') \ge a'$ for any $a'$, then $D$ does not
exist.


Proof. The hypotheses imply $P(0) = B$. Hence if $A = a'B^k$, then
$G(A) = P(a')B^k \ge a'B^k = A$, for every $k$.


Case 6. If $M \le B$ and $P(a') < a'$ for every $a'$, then $D = 0$.

Proof. If $B^k \le A < B^{k+1}$, then

$G(A) = P(a')\prod_{i=0}^{k-1} P(a_i) < a'B^k \le A$

for every $k$. Since $G(0) \ge 0$, $D = 0$.


Since these six cases exhaust the possible situations, the only "interesting"
cases (having $D > 0$) arise when $1 \le M' \le M < B$ and $P(a') \ge a'$ for at
least one $a'$. For these cases the actual value of $D$ and the orbits of G-related
integers and their cycles may be determined by methods similar to those
in §§ 3 and 6.

In particular, the choice $P(a) = a^t$ leads to an "interesting" case only
when $t = 1$, and then there are $B - 1$ orbits, each infinite and of period 1.
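
For a small base the bound of Case 4 makes a direct search for $D$ practical. The sketch below assumes $M < B$ and $B \ge 3$ and searches below $B^{k_1}$ with $k_1 = (B - 1)(B - 2)$; the function names and the two sample choices of $P$ are illustrations, not taken from the paper.

```python
def G(n: int, base: int, P) -> int:
    """Product of P over the base-`base` digits of n (G(0) = P(0))."""
    prod = 1
    while True:
        n, d = divmod(n, base)
        prod *= P[d]
        if n == 0:
            return prod


def largest_D(base: int, P) -> int:
    """Largest A with G(A) >= A, using the Case 4 range 0 <= D < B**k1."""
    if base < 3 or max(P) >= base:
        raise ValueError("sketch assumes base >= 3 and M < B (Case 4)")
    k1 = (base - 1) * (base - 2)
    return max(a for a in range(base**k1) if G(a, base, P) >= a)
```

With $B = 4$ the search range is $4^6 = 4096$. For $P(a) = a$ it returns $B - 1 = 3$, and for the constant choice $P(a) = 3$ it returns 81, a four-digit number in base 4 with $G = 3^4$.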


The author thanks the referee for his stimulating criticisms and suggestions. 


REFERENCES 


1. R. Isaacs, Iterates of fractional order, Can. J. Math., 2 (1950), 409-416. 
2. A. Porges, A set of eight numbers, Amer. Math. Monthly, 52 (1945), 379-382.


Michigan State University 











A LINEAR DIOPHANTINE PROBLEM 
S. M. JOHNSON 


1. Introduction. Let $a_1, a_2, \ldots, a_t$ be a set of groupwise relatively prime
positive integers. Several authors, (2; 3; 5; 6), have determined bounds for
the function $F(a_1, \ldots, a_t)$ defined by the property that the equation

(1) $n = a_1x_1 + a_2x_2 + \cdots + a_tx_t$

has a solution in positive integers $x_1, \ldots, x_t$ for $n > F(a_1, \ldots, a_t)$. If
$F(a_1, \ldots, a_t)$ is a function of this type, it is easy to see that

(2) $G(a_1, \ldots, a_t) = F(a_1, \ldots, a_t) - a_1 - a_2 - \cdots - a_t$

is the corresponding function for the solvability of (1) in non-negative $x$'s.

It is well known that $a_1a_2$ is the best bound for $F(a_1, a_2)$ and $a_1a_2 - a_1 - a_2$
for $G(a_1, a_2)$. Otherwise only in very special cases have the best bounds been
found, even for $t = 3$.

In the present paper a symmetric expression is developed for the best bound
for $F(a_1, a_2, a_3)$ which solves that problem and gives insight on the general
problem for larger values of $t$. In addition, some relations are developed which
may be of interest in themselves.


2. A General Property. For $t \ge 2$, let $B(a_1, a_2, \ldots, a_t)$ be the best bound
for $F(a_1, a_2, \ldots, a_t)$, that is, $B$ is the maximum number $N$ where

(3) $N \ne \sum_{i=1}^{t} x_ia_i$ for any $x_i > 0$.

Then note that $B$ is the maximum $N$ from a restricted set of numbers $N$
satisfying both (3) and

(4) $N + a_i = \sum_{j=1}^{t} y_{ij}a_j$, $y_{ij} > 0$ for each $i$,

since the definition of $B$ implies $B$ satisfies (4). Thus, in particular,
$N = (y_{11} - 1)a_1 + y_{12}a_2 + \cdots + y_{1t}a_t$, $y_{1j} > 0$.
But by (3), $y_{11} - 1 \le 0$ so that $y_{11} = 1$ since $y_{11} > 0$. By symmetry we have

THEOREM 1. For every $N$ satisfying (3) and (4) there are representations of $N$
for each $i = 1, 2, \ldots, t$ of the form

(5) $N = \sum_{j \ne i} y_{ij}a_j$, $y_{ij} > 0$,

and $B$ is the maximum such $N$.
Received October 21, 1957; in revised form March 9, 1959. 




3. The Case $t = 3$. A reduction formula. We seek an expression for
$B = B(a_1, a_2, a_3)$ having the property that (1) is satisfied for $n > B$ but is not
satisfied for $n = B$. Let us first reduce the problem to the case of pairwise
relatively prime $a$'s.

Let $d_{ij} = (a_i, a_j)$, $a_i = b_id_{ij}d_{ik}$, so that $(b_1, b_2) = (b_2, b_3) = (b_3, b_1) = 1$.
Then we have


THEOREM 2.
(6) $B(a_1, a_2, a_3) = d_{12}d_{23}d_{31}B(b_1, b_2, b_3)$.

Proof. First we show that if we write $d = d_{12}$, $b'_1 = d_{13}b_1$, $b'_2 = d_{23}b_2$ so that
$(d, a_3) = (b'_1, b'_2) = 1$, then
(7) $B(db'_1, db'_2, a_3) = dB(b'_1, b'_2, a_3)$.
Suppose that $dB(b'_1, b'_2, a_3) = db'_1x + db'_2y + a_3z$, $x, y, z > 0$. Then since
$(d, a_3) = 1$, we must have $z = wd$, $w > 0$, so that $B(b'_1, b'_2, a_3) = b'_1x + b'_2y
+ a_3w$, $x, y, w > 0$, a contradiction to the definition of $B(b'_1, b'_2, a_3)$. In addition,
for any positive integer $m > 0$, we show that

(8) $dB(b'_1, b'_2, a_3) + m = db'_1x + db'_2y + a_3z$, $x, y, z > 0$.
We apply a result from (2).

LEMMA 1 (Brauer). Let $a$ and $b$ be relatively prime positive integers. Then every
positive integer $m$ divisible neither by $a$ nor by $b$ is representable either in the form

(9) $m = au + bv$, $u > 0$, $v > 0$,
or
(10) $m = ab - au - bv$, $b > u > 0$, $a > v > 0$.

Letting $d = a$ and $a_3 = b$ in Lemma 1, if (9) holds, we have
(11) $dB(b'_1, b'_2, a_3) + m = d(B(b'_1, b'_2, a_3) + u) + va_3
= db'_1x + db'_2y + a_3(dz + v)$

by the definition of $B(b'_1, b'_2, a_3)$, giving (8).

If (10) holds, we have $0 < u < a_3$, and $0 < v < d$, so that
(12) $d(B(b'_1, b'_2, a_3) + a_3 - u) - va_3 = db'_1x + db'_2y + (dz - v)a_3$,
for $x$, $y$, and $(dz - v) > 0$, giving (8).

Finally, if $m = ud$, then (8) follows directly. If $m = va_3$, write $m = da_3
+ (v - d)a_3$ giving (8). Thus (7) holds. Applying the method of obtaining
(7) twice more gives (6) and Theorem 2.
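
Relation (7) can be spot-checked by brute force on a small triple. The sketch below computes the largest integer with no representation in positive multiples by first finding the largest integer with no representation in non-negative multiples and then adding $a_1 + a_2 + a_3$, as in (2); the function names and the sample triple are illustrative.

```python
from functools import reduce
from math import gcd


def largest_gap_nonnegative(nums) -> int:
    """Largest n with no representation sum x_i*a_i, x_i >= 0 (-1 if none)."""
    if reduce(gcd, nums) != 1:
        raise ValueError("the integers must have greatest common divisor 1")
    smallest = min(nums)
    reach = [True]      # reach[n] is True when n is representable
    run = 1             # current run of consecutive representable values
    last_gap = -1
    n = 0
    # Once `smallest` consecutive values are representable, all larger are.
    while run < smallest:
        n += 1
        ok = any(n >= a and reach[n - a] for a in nums)
        reach.append(ok)
        if ok:
            run += 1
        else:
            run = 0
            last_gap = n
    return last_gap


def best_bound_positive(nums) -> int:
    """B: largest N not of the form sum x_i*a_i with every x_i > 0."""
    return largest_gap_nonnegative(nums) + sum(nums)
```

For instance, with $d = 2$ the triple $(6, 10, 7)$ reduces to $(3, 5, 7)$, and the two bounds differ by exactly the factor 2.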

We have thus reduced the problem to where the $a$'s are pairwise relatively
prime. For the moment let $a_1 > a_2 > a_3$. If
(13) $a_1 = ua_2 + va_3$, $u, v \ge 0$,
then $B(a_1, a_2, a_3) = a_2a_3 + a_1$ as Brauer showed in (2). Otherwise


(14) $B(a_1, a_2, a_3) < a_2a_3 + a_1$.













4. An expression for $B(a_1, a_2, a_3)$. We develop a symmetric expression for
$B(a_1, a_2, a_3)$ for the case of pairwise relatively prime $a$'s where each $a_i \ne xa_j
+ ya_k$, $x \ge 0$, $y \ge 0$. Later we show that this same form of expression gives
the general solution for $t = 3$.


DEFINITION. Let $L_i$ = the minimum positive $K_i$ satisfying
(15) $K_ia_i = v_{ij}a_j + v_{ik}a_k$, $v_{ij} \ge 0$, $v_{ik} \ge 0$, $i = 1, 2, 3$.
Such a number exists since $B(a_j, a_k) = a_ja_k < Ka_i$ for large $K$.


THEOREM 3. Given

(16) $(a_1, a_2) = (a_2, a_3) = (a_3, a_1) = 1$

and

(17) $L_i > 1$, $i = 1, 2, 3$

and

(15') $L_ia_i = x_{ij}a_j + x_{ik}a_k$,

then the $x_{ij}$ are uniquely defined and
(18) $x_{ij} > 0$.
Since $L_i > 1$, it follows from (10) and (16) that
(19) $a_i = a_ja_k - v_{ij}a_j - v_{ik}a_k$
where $0 < v_{ij} < a_k$, $0 < v_{ik} < a_j$. Thus $v_{ik}a_k + a_i = (a_k - v_{ij})a_j \ge L_ja_j$ and
so by symmetry
(20) $L_j < a_k$ for each $j \ne k$.


If $x_{ij} = 0$, then $L_ia_i = x_{ik}a_k$ and by (16) $L_i = ma_k$, a contradiction to
(20). This gives (18). Also the $x_{ij}$ are uniquely determined since if $L_ia_i = x_{ij}a_j
+ x_{ik}a_k = z_{ij}a_j + z_{ik}a_k$, then by (16) we have $x_{ij} = z_{ij} + ma_k$ and $x_{ik} = z_{ik} - ma_j$.
If $m > 0$, $x_{ij} \ge a_k$. But then for some $d \ge 0$, $L_ia_i = (a_k + d)a_j + x_{ik}a_k$ and
by (19) we get $(L_i - 1)a_i = (d + v_{ij})a_j + (x_{ik} + v_{ik})a_k$, contradicting the
definition of $L_i$. Similarly, for $m < 0$.

For $t = 3$ and (16) and (17) we show that there are just two numbers $N$
with properties (3) and (4) so that $B$ is the larger of these numbers. From
(5) such a number $N$ has representations of the form


(5') $N = y_{ij}a_j + y_{ik}a_k$, $i = 1, 2, 3$.
Next observe that from (18) we have

(21) $y_{ij} \le L_j$

since otherwise for some $d_j > 0$ we would have $N = (L_j + d_j)a_j + y_{ik}a_k
= x_{ji}a_i + d_ja_j + (x_{jk} + y_{ik})a_k$, contradicting (3). From (20) and (21) we
have


(22) $y_{ij} < a_k$, $y_{ik} < a_j$.










Next we show that the representations (5') for $N$ are unique for each $i$.
For otherwise $y_{ij}a_j + y_{ik}a_k = z_{ij}a_j + z_{ik}a_k$, and from (16) and (22), $y_{ij} - z_{ij}
= ma_k$, $m \le 0$, and $y_{ik} - z_{ik} = -ma_j$, $m \ge 0$, so that $m = 0$ and $y_{ij} = z_{ij}$, etc.

From (5') and Theorem 1 we now have unique representations of $N$ of the
form


$N = y_{ij}a_j + y_{ik}a_k = y_{ji}a_i + y_{jk}a_k = y_{ki}a_i + y_{kj}a_j$.


If $y_{kj} = y_{ij}$, then $y_{ik} = ma_i$, contradicting (22). Thus either $y_{kj} < y_{ij}$ or
$y_{kj} > y_{ij}$.


Case 1. If
(23) $y_{kj} < y_{ij}$
then $y_{ki}a_i = (y_{ij} - y_{kj})a_j + y_{ik}a_k$ so that $y_{ki} \ge L_i$. Thus by (21) we have
(24) $y_{ki} = L_i$.
Then by (24) and (5')
$N = L_ia_i + y_{kj}a_j = y_{ji}a_i + y_{jk}a_k$


or $(L_i - y_{ji})a_i + y_{kj}a_j = y_{jk}a_k$, where $L_i \ge y_{ji}$ by (21). If $L_i = y_{ji}$, then
$y_{kj} = ma_k$, contradicting (22), so that $L_i - y_{ji} > 0$ and $y_{jk} \ge L_k$ by the
definition of $L_k$. But then $y_{jk} = L_k$ by (21). Thus (23) implies that $y_{ki} = L_i$,
$y_{jk} = L_k$, and cyclically, $y_{ij} = L_j$. But then by (15')


$N = (x_{ij} + y_{kj})a_j + x_{ik}a_k = L_ja_j + y_{ik}a_k$


and by the uniqueness of these representations and by cyclic permutation of
subscripts, we have


(25) $y_{ik} = x_{ik}$
and
(26) $L_j = x_{ij} + x_{kj}$.


Thus if $y_{kj} < y_{ij}$, we get a unique number $N$ where
(27) $N = L_ia_i + x_{kj}a_j$
with cyclic permutations of subscripts.

Case 2. If
(28) $y_{kj} > y_{ij}$,
we get another number where by symmetry
(29) $N' = L_ia_i + x_{jk}a_k$


with cyclic permutations of subscripts. $N \ne N'$ since otherwise $x_{kj}a_j = x_{jk}a_k$
which implies $x_{jk} \ge a_j$, which by (25) contradicts (22). Note that these two
numbers are the only numbers with properties (3) and (4) for (16), (17),
and $t = 3$. Since $B$ is the largest number with property (3), it satisfies (4) so
that $B$ is the maximum of $N$ and $N'$ and we have


THEOREM 4. Given (16) and (17), then for cyclic permutations of subscripts
(30) $B(a_1, a_2, a_3) = L_ia_i + \max(x_{jk}a_k, x_{kj}a_j)$
and (26) holds.

Also it is easy to verify that $C$, the corresponding best bound for $G(a_1, a_2, a_3)$,
satisfies
(31) $C(a_1, a_2, a_3) + a_1 + a_2 + a_3 = B(a_1, a_2, a_3)$.
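
Formula (30) can be tried out on a small pairwise relatively prime triple by finding each $L_i$ from its definition as a minimal multiplier. The sketch below does this by plain search and compares the result with a brute-force value of $B$; it is an illustration under hypotheses (16) and (17), and the function names are not from the paper.

```python
def minimal_relation(a_i: int, a_j: int, a_k: int):
    """Smallest K >= 1 with K*a_i = x*a_j + y*a_k, x, y >= 0; returns (K, x, y)."""
    K = 1
    while True:
        target = K * a_i
        for x in range(target // a_j + 1):
            rest = target - x * a_j
            if rest % a_k == 0:
                return K, x, rest // a_k
        K += 1


def johnson_bound(a1: int, a2: int, a3: int) -> int:
    """B = L_1*a_1 + max(x_23*a_3, x_32*a_2), assuming (16) and (17)."""
    L1, _x12, _x13 = minimal_relation(a1, a2, a3)
    L2, _x21, x23 = minimal_relation(a2, a1, a3)
    L3, _x31, x32 = minimal_relation(a3, a1, a2)
    if min(L1, L2, L3) <= 1:
        raise ValueError("hypothesis (17) requires every L_i > 1")
    return L1 * a1 + max(x23 * a3, x32 * a2)


def brute_force_bound(nums) -> int:
    """Largest N with no representation in positive multiples of nums."""
    smallest = min(nums)
    reach, run, last_gap, n = [True], 1, -1, 0
    while run < smallest:
        n += 1
        ok = any(n >= a and reach[n - a] for a in nums)
        reach.append(ok)
        run = run + 1 if ok else 0
        if not ok:
            last_gap = n
    return last_gap + sum(nums)
```

For the triple $(7, 9, 11)$ the search gives $L_1 = 6$, $L_2 = 2$, $L_3 = 4$, and both routes give $B = 53$.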


5. A computing algorithm for $L_i$ and $x_{ij}$. Thus we have shown that
finding $B$ is equivalent to finding the set of positive integers $L_i$ and $x_{ij}$
exhibited in the form of a matrix of detached coefficients of the three equations
(15') as follows:


   a_1      a_2      a_3
  -L_1     x_12     x_13
   x_21    -L_2     x_23
   x_31     x_32    -L_3








In order to develop a simple computing algorithm for these numbers, we
need the following result.


LEMMA 2. Given $(a_1, a_2) = (a_2, a_3) = (a_3, a_1) = 1$, then any system of integers
$K_i > 1$ and $v_{ij} \ge 0$ (not necessarily $L_i$ and $x_{ij}$) satisfying (15) and (26)
$K_j = v_{ij} + v_{kj}$, implies that
(32) $K_jK_k - v_{jk}v_{kj} = v_{ij}K_k + v_{ik}v_{kj} = \lambda a_i \ge a_i$
for some positive integer $\lambda$.

If we write

$v_{jk}(K_ia_i - v_{ij}a_j) = v_{jk}v_{ik}a_k = v_{ik}(K_ja_j - v_{ji}a_i)$,
then
$(v_{jk}K_i + v_{ik}v_{ji})a_i = (v_{ik}K_j + v_{jk}v_{ij})a_j$
and (32) follows by (16) and (26).
Furthermore, we have 


THEOREM 5. If (16) and (17) hold, then the $L_i$ and $x_{ij}$ in Theorem 4 are
characterized by the equations (15') and (26), and


(33) $L_jL_k - x_{jk}x_{kj} = a_i$,


for cyclic permutations of subscripts. That is, $\lambda = 1$ in (32).










Proof. Suppose a system of $K_i$ and $v_{ij}$ satisfy (15), (26), and (33) where
at least one $K_i > L_i$, the minimum positive integer satisfying (15).


Case 1. If $K_1 = L_1$, $K_2 = L_2$, then $K_3 = L_3$ by (26) and Theorem 3.
Case 2. Suppose $K_1 = L_1$, but $K_2 > L_2$, $K_3 > L_3$.


Then $x_{12} = v_{12}$ and $x_{13} = v_{13}$ by Theorem 3 and by (15), (26), and (33)
$a_1 = K_2K_3 - v_{32}v_{23} = K_2K_3 - (K_2 - x_{12})(K_3 - x_{13}) = x_{12}K_3 + x_{13}K_2 -
x_{12}x_{13} > x_{12}L_3 + x_{13}L_2 - x_{12}x_{13} = L_2L_3 - x_{32}x_{23} \ge a_1$ by (32), a contradiction
to the assumption that $K_2 > L_2$, $K_3 > L_3$.


Case 3. If $K_1 > L_1$, $K_2 > L_2$, $K_3 > L_3$, then first observe that either $v_{ij} > x_{ij}$
or $v_{ik} > x_{ik}$, but not both. For suppose $v_{ij} > x_{ij}$ and $v_{ik} > x_{ik}$. By (33)
$v_{ij}K_k + v_{ik}v_{kj} = a_i \le x_{ij}L_k + x_{ik}x_{kj}$ by (32). Thus $v_{kj} < x_{kj}$. Similarly
$v_{ik}K_j + v_{ij}v_{jk} = a_i \le x_{ik}L_j + x_{ij}x_{jk}$ so that $v_{jk} < x_{jk}$. But then $a_i \le L_jL_k
- x_{jk}x_{kj} < K_jK_k - v_{jk}v_{kj} = a_i$, a contradiction.

In addition either $v_{ij} > x_{ij}$ or $v_{kj} > x_{kj}$, but not both. For suppose $v_{ij} > x_{ij}$
and $v_{kj} > x_{kj}$. By the previous remark $v_{ik} \le x_{ik}$, $v_{ki} \le x_{ki}$, leading to the
same contradiction obtained above. Thus either $v_{12}, v_{23}, v_{31}$, or $v_{21}, v_{32}, v_{13}$ are
larger than the corresponding $x$'s. That is $v_{ij} > x_{ij}$ for cyclic permutations
of subscripts.

Suppose $v_{21}, v_{32}, v_{13}$ are larger than $x_{21}, x_{32}, x_{13}$ respectively. Then by (26)

$(K_2 - L_2)a_2 + (x_{23} - v_{23})a_3 = (v_{21} - x_{21})a_1 \ge L_1a_1$
by the definition of $L_1$. Thus $v_{21} > L_1$ and by cyclic permutation of subscripts
$v_{32} > L_2$, $v_{13} > L_3$.

Finally $a_3 \le L_1L_2 - x_{12}x_{21} < L_1L_2 < v_{21}v_{32} \le v_{21}v_{32} + K_2v_{31} = a_3$, a
contradiction.

Thus $\lambda = 1$ in (32) implies that $K_i = L_i$, $v_{ij} = x_{ij}$.

Conversely, $\lambda = 1$ in (32), for $K_i = L_i$, $v_{ij} = x_{ij}$, etc. By the following
computing algorithm we can always find sets of $K_i$ and $v_{ij}$ with $\lambda = 1$ in
(32). Thus they are the desired $L_i$ and $x_{ij}$. Moreover since the $x_{ij}$ are unique
by Theorem 3, $\lambda$ is unique and must equal 1.

The usefulness of Theorem 5 is apparent since it will be easier to find $K$'s
and $v$'s satisfying (15), (26), and (33) rather than find minimal solutions to
(15).

The algorithm follows. First we solve for any $a_k$ in terms of $a_i$ and $a_j$; for
instance, for $k = 3$, giving
(34) $v_{21}a_1 - K_2a_2 + a_3 = 0$
with $0 < v_{21} < a_2$, $0 < K_2 < a_1$ by (10), easily done for example as in (4).

Next construct
(35) $-K_1a_1 + v_{12}a_2 + v_{13}a_3 = 0$
where

$v_{13} = [a_2/v_{21}]$, $K_1 = a_2 - v_{21}v_{13}$, $a_1 = K_2v_{13} + v_{12}$,

so that $\lambda = 1$ in (32). If $K_1 > v_{21}$, then $K_1 = L_1$, $K_2 = L_2$, and $L_3$ can be
found by (26). Then apply Theorem 4 for $B(a_1, a_2, a_3)$. If $K_1 < v_{21}$, note that
$K_1 \nmid v_{21}$. For if $K_1 \mid v_{21}$, then since $K_1 = a_2 - v_{21}v_{13}$, $K_1 \mid a_2$. But then in (34)
$K_1 \mid a_3$. Thus $(a_2, a_3) \ge K_1 > 1$, by (17) a contradiction.

Therefore if $K_1 < v_{21}$ we can construct another equation

(36) $(v_{21} - pK_1)a_1 - (K_2 - pv_{12})a_2 + (1 + pv_{13})a_3 = 0$

with

$p = [v_{21}/K_1]$.

Since $v_{21} - pK_1 > 0$, $K_2' = K_2 - pv_{12}$ forms a smaller value of $K_2$ in (34).

Note that the pair of equations giving the smallest values of $K_1$ and $K_2$
will still give $\lambda = 1$ in (32). At each stage we repeat the above generating
of a smaller $K_1$ or $K_2$ until eventually $K_1 = L_1$, $K_2 = L_2$. By Theorem 5 this
will come about when we obtain equations of the type (34) and (35) with
$K_1 > v_{21}$ and $K_2 > v_{12}$.

To illustrate we find $B(137, 251, 256)$. First calculate that

$a_1 - 75a_2 + 73a_3 = 0$.


Then by the algorithm we obtain
$3a_1 + 31a_2 - 32a_3 = 0$,
$7a_1 - 13a_2 + 9a_3 = 0$,
$17a_1 + 5a_2 - 14a_3 = 0$.

Thus the matrix of detached coefficients is


  a_1    a_2    a_3
  -24      8      5
    7    -13      9
   17      5    -14


and $B = 24a_1 + 9a_3 = 5{,}592$.
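
The matrix of this example can be tested against the characterizing relations (15'), (26) and (33), and the resulting bound against a direct search. The following sketch assumes the sign convention of the matrix of detached coefficients (each row sums to zero against $a_1, a_2, a_3$); the helper names are illustrative.

```python
A_EXAMPLE = (137, 251, 256)
M_EXAMPLE = [(-24, 8, 5), (7, -13, 9), (17, 5, -14)]


def satisfies_relations(a, M) -> bool:
    """Check (15'), (26) and (33) for a 3x3 matrix of detached coefficients."""
    rows_vanish = all(sum(c * x for c, x in zip(row, a)) == 0 for row in M)
    L = [-M[i][i] for i in range(3)]
    # (26): L_j equals the sum of the off-diagonal entries in column j.
    columns_ok = all(
        L[j] == sum(M[i][j] for i in range(3) if i != j) for j in range(3)
    )
    # (33): L_j*L_k - x_jk*x_kj = a_i for {i, j, k} = {0, 1, 2}.
    products_ok = all(
        L[j] * L[k] - M[j][k] * M[k][j] == a[i]
        for i, j, k in ((0, 1, 2), (1, 2, 0), (2, 0, 1))
    )
    return rows_vanish and columns_ok and products_ok


def bound_from_matrix(a, M) -> int:
    """Formula (30) with i = 1: L_1*a_1 + max(x_23*a_3, x_32*a_2)."""
    return -M[0][0] * a[0] + max(M[1][2] * a[2], M[2][1] * a[1])


def brute_force_bound(nums) -> int:
    """Largest N with no representation in positive multiples of nums."""
    smallest = min(nums)
    reach, run, last_gap, n = [True], 1, -1, 0
    while run < smallest:
        n += 1
        ok = any(n >= x and reach[n - x] for x in nums)
        reach.append(ok)
        run = run + 1 if ok else 0
        if not ok:
            last_gap = n
    return last_gap + sum(nums)
```

Both computations give 5592 for the triple $(137, 251, 256)$.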
It should be pointed out that solving for (34) is not always necessary.
Many computational short cuts become apparent after some practice. Note
that the suggested algorithm is not merely numerical but gives algebraic
relations as well, enabling one to solve all previously solved special cases for
$t = 3$ by a unified approach. For example, see the end of the next section.


6. Extensions and restatement of basic theorem. Even if $L_i = 1$, the
statement of Theorems 4 and 5 still holds, dropping the minimality
condition on the $L_j$. In this case, $B = a_1a_2 + a_3$, see (2). But the matrix of
coefficients is


  a_1             a_2              a_3
  -a_2            a_1               0
  a_2 - x_31      -a_1 - x_32       1
  x_31            x_32             -1


with $x_{31} < a_2$. Then $\lambda = 1$, so that Theorem 5 gives the same result $a_1a_2 + a_3$.

Next we show that Theorems 4 and 5 hold even though the $a_i$'s are not reduced to a pairwise relatively prime set $b_1, b_2, b_3$.

We compare the $L$'s and $x_{ij}$'s associated with $a_1, a_2, a_3$ with those $L'$'s and $x'_{ij}$'s associated with $b_1, b_2, b_3$. From (15'), $L_ia_i = x_{ij}a_j + x_{ik}a_k$, we see that $d_{jk} \mid L_i$, $d_{ij} \mid x_{ik}$, $d_{ik} \mid x_{ij}$. Thus, setting $L_i = d_{jk}L_i'$, $x_{ik} = d_{ij}x_{ik}'$, we have

(35) $L_iL_j - x_{ij}x_{ji} = a_k$ if and only if $L_i'L_j' - x'_{ij}x'_{ji} = b_k$,

since $d_{jk}d_{ik}(L_i'L_j' - x'_{ij}x'_{ji}) = d_{jk}d_{ik}b_k = a_k$.

Finally, all these results can be collected in the following form: 

THEOREM 6. For $(a_1, a_2, a_3) = 1$, define $B$ to be the largest number not of the form $xa_1 + ya_2 + za_3$, $x, y, z > 0$. Then for cyclic permutation of subscripts

$B = L_ia_i + \max(x_{jk}a_k, x_{kj}a_j)$,

where

$L_ia_i = x_{ij}a_j + x_{ik}a_k$, $L_i > 0$, $x_{ij} \ge 0$, $x_{ik} \ge 0$, $L_i = x_{ji} + x_{ki}$,

and

$L_iL_j - x_{ij}x_{ji} = a_k$.

The $L$'s and $x$'s can be found either by the computing algorithm discussed in §5, modified to solve first for $d_{jk}a_i$ in terms of $a_j$ and $a_k$, or by first applying Theorem 2.
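As an illustration, the relations of Theorem 6 can be checked on the matrix of detached coefficients obtained in §5 for $B(137, 251, 256)$. The sketch assumes the reading of the theorem in which row $i$ of the matrix carries $-L_i$ on the diagonal and $x_{ij}$ elsewhere, and $B = L_ia_i + \max(x_{jk}a_k, x_{kj}a_j)$; the function name `theorem6_values` is hypothetical.

```python
a = (137, 251, 256)
# Row i of the matrix of detached coefficients: -L_i on the diagonal,
# x_ij in column j (values taken from the worked example of Section 5).
M = [(-24, 8, 5),
     (7, -13, 9),
     (17, 5, -14)]


def theorem6_values(a, M):
    """Check the side relations and return B computed from each row i."""
    L = [-M[t][t] for t in range(3)]
    values = []
    for i in range(3):
        j, k = (i + 1) % 3, (i + 2) % 3  # cyclic permutation of subscripts
        assert L[i] * a[i] == M[i][j] * a[j] + M[i][k] * a[k]
        assert L[i] == M[j][i] + M[k][i]
        assert L[i] * L[j] - M[i][j] * M[j][i] == a[k]
        values.append(L[i] * a[i] + max(M[j][k] * a[k], M[k][j] * a[j]))
    return values


print(theorem6_values(a, M))
```

All three cyclic choices of $i$ give the same value of $B$, which is a useful consistency check on a hand-computed matrix.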

In conclusion, observe that the special cases previously obtained for $t = 3$ can be derived directly from the results of this paper.


Example. We can extend the results stated in (5) for $B(a, a + 1, a + z)$. Write $a = kz - u$, $0 \le u < z$, $k > 1$, $z > 2$. Then for $u < k + 1$ the coefficient matrix is

$$\begin{array}{ccc}
a = a_1 = kz - u & a_2 = kz - u + 1 & a_3 = kz - u + z \\
-(z + k - u) & z - u & k - 1 \\
z - 1 & -z & 1 \\
k + 1 - u & u & -k
\end{array}$$

If $u \le 1$, then

$B = L_3a_3 + x_{12}a_2 = k(a + z) + (z - u)(a + 1)$.













To correspond to the notation of (5), we solve for $C + 1 = B + 1 - \sum a_i$. Then


C+1= (24-2) a+ (ze —2— u)a. 
If $u > 1$, then $B = L_3a_3 + x_{21}a_1 = k(a + z) + (z - 1)a$, and


C+1= E ay (a + 2) + (2 — 3)a 


(2+2) ye jets) 41. 


For t > 3, Theorem 1 holds and the author has verified that relations 
analogous to Theorem 4 hold in many cases. However, this will be the subject 
of a later paper. 


since 


REFERENCES 


1. P. T. Bateman, Remark on a recent note on linear forms, Amer. Math. Monthly, 65 (1958), 517-518.
2. A. T. Brauer, On a problem of partitions, I, Amer. J. Math., 64 (1942), 299-312.
3. A. T. Brauer and B. M. Seelbinder, On a problem of partitions, II, Amer. J. Math., 76 (1954), 343-346.
4. R. J. Levit, A minimum solution for a diophantine equation, Amer. Math. Monthly, 63 (1956), 646-651.
5. J. B. Roberts, Note on linear forms, Proc. Amer. Math. Soc. (1956), 465-469.
6. J. B. Roberts, On a diophantine problem, Can. J. Math., 9 (1957), 219-223.




The Rand Corporation 
Santa Monica, California 






















ARITHMETICAL INVERSION FORMULAS 
ECKFORD COHEN 


1. Introduction. Let $n$ and $r$ be integers, $r$ positive, and define the core $\gamma(r)$ of $r$ to be the product of the distinct prime factors of $r$ ($\gamma(1) = 1$). Let $f(n, r)$ be a complex-valued, arithmetical function of $n$ and $r$. If for all $n$, $f(n, r) = f((n, r), r)$ then $f(n, r)$ is called an even function (mod $r$), and if $f(n, r) = f(\gamma(n, r), r)$ for all $n$, $\gamma(n, r) = \gamma((n, r))$, then $f(n, r)$ is said to be a primitive function (mod $r$). Clearly, both classes of functions are subclasses of the periodic functions (mod $r$), while the primitive functions form a subclass of the even functions (mod $r$).

In a series of three papers (3; 5; 6) the author developed parallel, though interrelated, trigonometric and arithmetical theories of the even and primitive functions (mod $r$). It was shown (3, Theorem 3) that $f(n, r)$ is even (mod $r$) if and only if it possesses a representation of the form

(1.1) $f(n, r) = \sum_{d \mid (n, r)} F\left(d, \frac{r}{d}\right)$

and that $f(n, r)$ is primitive (mod $r$) if and only if it possesses a representation of the form (5, Theorem 8),

(1.2) $f(n, r) = \sum_{d \mid \gamma(r),\ (d, n) = 1} G\left(d, \frac{r}{d}\right)$.

It is the purpose of the present paper to develop a purely arithmetical theory of these two classes of functions, built on the unifying idea of arithmetical inversion.

More precisely, the method of the paper is based on two arithmetical inversion principles, the first (Theorem 2.1) relating to the class of all even functions (mod $r$), while the second (Theorem 2.3) is limited to the primitive functions (mod $r$). We remark that the first of these two results becomes equivalent (Corollary 2.2) to the ordinary Möbius inversion formula in case $f(n, r)$ is restricted to the subclass of completely even functions (mod $r$), that is, functions satisfying $f(n, r) = f(n', r')$ for all $n$, $n'$, and all positive $r$, $r'$ such that $(n, r) = (n', r')$. An analogous result (Corollary 2.5) is proved for the completely primitive functions (mod $r$), that is, functions satisfying $f(n, r) = f(n', r')$ for all $n$, $n'$ and all positive $r$, $r'$ such that $\gamma(r)/\gamma(n, r) = \gamma(r')/\gamma(n', r')$.


Received January 16, 1959. 















The characterizations (1.1) and (1.2) of the even and primitive functions (mod $r$) follow as immediate consequences (Theorems 2.2 and 2.4, respectively) of the above-mentioned inversion relations. Moreover, it also follows that the functions, $F(r_1, r_2)$ and $G(r_1, r_2)$, are uniquely determined, under appropriate restrictions on the integral variables $r_1$ and $r_2$.

Sections 3 and 4 are devoted to proofs of generalizations of three fundamental identities in the arithmetical theory of even functions. These identities are stated as follows. Let $\mu(r)$ denote the Möbius inversion function and $\phi(r)$ the Euler totient; then

(1.3) $c(n, r) = \sum_{d \mid (n, r)} d\,\mu\left(\frac{r}{d}\right) = \frac{\phi(r)\mu(\delta)}{\phi(\delta)} = \Phi(n, r)$,

where $\delta = r/(n, r)$;

(1.4) $\phi(r) \sum_{d \mid r,\ (d, n) = 1} \frac{d}{\phi(d)}\,\mu\left(\frac{r}{d}\right) = \mu(r)\,c(n, r)$;

(1.5) $\sum_{d \mid r,\ (d, n) = 1} \frac{\mu^2(d)}{\phi(d)} = \frac{r\,\phi((n, r))}{\phi(r)\,(n, r)}$.

Formula (1.3) is Hölder's relation (7), which asserts the equality between the Dedekind-von Sterneck function $\Phi(n, r)$ and Kluyver's function $c(n, r)$, or equivalently, the arithmetical form of Ramanujan's sum. The identity (1.4) is due to Brauer and Rademacher (2; 6, § 5), while (1.5) is due in the case $n = 1$ to Landau (8, p. 182); for a proof of the extended form (1.5), we mention (4, Theorem 9). In the sequel, these three relations will be referred to as the Hölder, Brauer-Rademacher, and Landau identities, respectively.
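The three identities can be spot-checked by direct computation. The following is only a numerical sketch in exact rational arithmetic; the helper names (`divisors`, `mobius`, `phi`, `ramanujan`) are illustrative, and the formulas are the standard forms of the Hölder, Brauer-Rademacher, and Landau identities as read here from (1.3)-(1.5).

```python
from fractions import Fraction
from math import gcd


def divisors(r):
    return [d for d in range(1, r + 1) if r % d == 0]


def mobius(r):
    result, p = 1, 2
    while p * p <= r:
        if r % p == 0:
            r //= p
            if r % p == 0:
                return 0  # square factor
            result = -result
        p += 1
    return -result if r > 1 else result


def phi(r):
    return sum(1 for t in range(1, r + 1) if gcd(t, r) == 1)


def ramanujan(n, r):
    """c(n, r) as the divisor sum in (1.3)."""
    return sum(d * mobius(r // d) for d in divisors(gcd(n, r)))


def holder_holds(n, r):
    delta = r // gcd(n, r)
    return ramanujan(n, r) == Fraction(phi(r) * mobius(delta), phi(delta))


def brauer_rademacher_holds(n, r):
    lhs = phi(r) * sum(Fraction(d * mobius(r // d), phi(d))
                       for d in divisors(r) if gcd(d, n) == 1)
    return lhs == mobius(r) * ramanujan(n, r)


def landau_holds(n, r):
    e = gcd(n, r)
    lhs = sum(Fraction(mobius(d) ** 2, phi(d))
              for d in divisors(r) if gcd(d, n) == 1)
    return lhs == Fraction(r * phi(e), phi(r) * e)


print(all(holder_holds(n, r) and brauer_rademacher_holds(n, r)
          and landau_holds(n, r)
          for r in range(1, 31) for n in range(1, r + 1)))
```

A finite check of this kind proves nothing, but it is a quick guard against misreading a sign or a summation condition.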

In Theorem 3.1 we give a new proof of a generalization of the Landau 
identity, proved originally in (5). The proof given in this paper is based on 
the theory of arithmetical inversion. As a consequence of the generalized 
Landau identity, we obtain in Theorem 3.2 a wide extension of the Brauer- 
Rademacher identity. 

In Theorem 4.1 we give a new proof, based on arithmetical inversion, of a generalization of the Hölder relation, due to Anderson and Apostol (1). The generalized Landau identity is also used in the proof of Theorem 4.1; moreover, a second proof of this identity is included in § 4, preceding the statement of the extended Hölder formula. The results of the paper are illustrated with a special case in § 5.

It is emphasized that the discussion of this paper is independent of the 
theory of even functions previously developed. We also mention that the 
results of the present paper remain valid when the field of values, assumed 
here to be complex, is replaced by an arbitrary field of characteristic 0. 


2. Arithmetical inversion of even functions (mod r). We now prove 
a general inversion principle for the even functions (mod r). 













First we recall the characteristic property of $\mu(r)$,

(2.1) $\sum_{d \mid r} \mu(d) = 1$ or $0$

according as $r = 1$ or $r > 1$.


THEOREM 2.1. Let $r_1$, $r_2$ denote positive integral variables.
(A) If $F(r_1, r_2)$ is an arbitrary function of $r_1$, $r_2$ and $f(n, r)$ is an even function (mod $r$) defined by

(2.2) $f(n, r) = \sum_{d \mid (n, r)} F\left(d, \frac{r}{d}\right)$,

then $F(r_1, r_2)$ has the form,

(2.3) $F(r_1, r_2) = \sum_{d \mid r_1} f\left(\frac{r_1}{d}, r\right)\mu(d)$, $r = r_1r_2$.

(B) Conversely, if $f(n, r)$ is an arbitrary even function (mod $r$) and $F(r_1, r_2)$ is defined by (2.3), then $f(n, r)$ has the form (2.2).


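Proof. (A) Assume first that $f(n, r)$ is defined by (2.2). Then, placing $r = r_1r_2$ and using (2.1), it follows that

$\sum_{d \mid r_1} f\left(\frac{r_1}{d}, r\right)\mu(d) = \sum_{d \mid r_1} \mu(d) \sum_{D \mid ((r_1/d), r)} F\left(D, \frac{r}{D}\right) = \sum_{D \mid r_1} F\left(D, \frac{r}{D}\right) \sum_{d \mid (r_1/D)} \mu(d) = F(r_1, r_2)$.

Thus (A) is proved.

(B) Assuming $F(r_1, r_2)$ to be defined by (2.3), we have, again by (2.1),

$\sum_{d \mid (n, r)} F\left(d, \frac{r}{d}\right) = \sum_{d \mid (n, r)} \sum_{D \mid d} f\left(\frac{d}{D}, r\right)\mu(D) = \sum_{E \mid (n, r)} f(E, r) \sum_{D \mid ((n, r)/E)} \mu(D) = f((n, r), r)$,

so that by the definition of an even function (mod $r$), (B) is proved.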
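The inversion of Theorem 2.1 can be exercised on a concrete example. In the sketch below the function `F` is an arbitrary illustrative choice (not taken from the paper); `f` is built from it by (2.2), and `F_recovered` applies (2.3), read here as $F(r_1, r_2) = \sum_{d \mid r_1} f(r_1/d, r)\mu(d)$ with $r = r_1r_2$.

```python
from math import gcd


def divisors(r):
    return [d for d in range(1, r + 1) if r % d == 0]


def mobius(r):
    result, p = 1, 2
    while p * p <= r:
        if r % p == 0:
            r //= p
            if r % p == 0:
                return 0  # square factor
            result = -result
        p += 1
    return -result if r > 1 else result


def F(r1, r2):
    # Arbitrary test function of two positive integers (assumption).
    return 3 * r1 * r1 + 7 * r2 + r1 * r2


def f(n, r):
    # Even function (mod r) defined from F by (2.2).
    return sum(F(d, r // d) for d in divisors(gcd(n, r)))


def F_recovered(r1, r2):
    # Inversion formula (2.3), with r = r1 * r2.
    r = r1 * r2
    return sum(f(r1 // d, r) * mobius(d) for d in divisors(r1))


print(all(F_recovered(r1, r2) == F(r1, r2)
          for r1 in range(1, 13) for r2 in range(1, 13)))
```

Because `f` depends on `n` only through `gcd(n, r)`, it is even (mod r) by construction, which is the hypothesis the inversion needs.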


We are thus led immediately to a characterization of the class of even 
functions (mod r). 


THEOREM 2.2. A function $f(n, r)$ is even (mod $r$) if and only if it has a representation of the form (2.2). Moreover, the function $F(r_1, r_2)$ is uniquely determined by (2.3) for positive values of $r_1$ and $r_2$.


Replacing $F(r_1, r_2)$ by $F(r_1)$ and $f(n, r)$ by $g((n, r))$, we obtain from Theorem 2.1, with $r_2 = 1$, the following inversion formula for the completely even functions (mod $r$).











COROLLARY 2.1. If $F(r)$ is a function of a positive integral variable $r$, and $f(n, r)$ is a completely even function (mod $r$) defined by

(2.4) $f(n, r) = g((n, r)) = \sum_{d \mid (n, r)} F(d)$,

then $F(r)$ has the form

(2.5) $F(r) = \sum_{d \mid r} f\left(\frac{r}{d}, r\right)\mu(d) = \sum_{d \mid r} g\left(\frac{r}{d}\right)\mu(d)$.

Conversely, if $f(n, r) = g((n, r))$ is completely even (mod $r$), and $F(r)$ is defined by (2.5), then $f(n, r)$ has the form (2.4).


Replacing $(n, r)$ in (2.4) by $r$, Corollary 2.1 becomes the ordinary Möbius inversion formula. In fact,


COROLLARY 2.2. The inversion relation of Theorem 2.1 is equivalent to the Möbius inversion formula, provided the class of functions $f(n, r)$ is restricted to the completely even functions (mod $r$).


We also have by Corollary 2.1, the following analogue of Theorem 2.2 (cf. 5, Theorem 4).


COROLLARY 2.3. A function $f(n, r)$ is completely even (mod $r$) if and only if it is representable in the form (2.4). The function $F(r)$ is uniquely determined by (2.5) for $r > 0$.


The following lemmas are needed in the proof of the inversion theorem for 
the primitive functions (mod r). 


Definition. An integer $r$ is said to be primitive if $r$ contains no square factors > 1.


LEMMA 2.1. If $r = r_1r_2$, $e \mid \gamma(r)$, and $r_1$ is primitive, then


x(re,d) = ) ru(ri)/y(r) \¢ = 11) 
dirnAtr) ™ Lo (e # T;). 
e.r/d)=1 
LEMMA 2.2. If r = ryre then 
(2.6) > x(ri,d) = ry(re). 
de=r 
gee 


LEMMA 2.3. If r is primitive, ro|r, and r,\ro, then 
( 
r as 
2 u(d) = | AA ) if n=(nr,n=r, 
Siri.dire | 0 otherwise. 
In view of the multiplicative property of $\mu(r)$ and $c(n, r)$ as functions of $r$, it is sufficient to verify the above lemmas in the case that $r$ is the power of a prime. The details are omitted.

























THEOREM 2.3. Let $r_1$, $r_2$ represent positive integral variables, $r_1$ primitive.
(A) If $G(r_1, r_2)$ is an arbitrary function of $r_1$, $r_2$ and $f(n, r)$ is a primitive function (mod $r$) defined by

(2.7) $f(n, r) = \sum_{d \mid \gamma(r),\ (d, n) = 1} G\left(d, \frac{r}{d}\right) = \sum_{d \mid (\gamma(r)/\gamma(n, r))} G\left(d, \frac{r}{d}\right)$,

then $G(r_1, r_2)$ has the form,

(2.8) $G(r_1, r_2) = \frac{\gamma(r)\mu(r_1)}{r} \sum_{d \mid (rr_1/\gamma(r))} f\left(\frac{r}{d}, r\right) c(r_2, d)$, $r = r_1r_2$.

(B) Conversely, if $f(n, r)$ is an arbitrary primitive function (mod $r$) and $G(r_1, r_2)$ is defined by (2.8), then $f(n, r)$ has the form (2.7).
Proof. (A) Assume that f(m, 7) is defined by (2.7), and let 7(r;, r2) denote 
the right member of (2.8). Then 


T(r, 72) = vr )u(rs) p » ( ik 6. ")) x (ro, d) 
r @i((rrid/(yv(r))) e\y(r) é 


(e.(r/d))<1 


v(r)u(rs) r 
= - “> Gle- > ~—x(r2, d). 
r eiyi(nr) e di((rrid/(yv(r))) 

(e.(r/d))=1 
Application of Lemma 2.1 yields T(r, r2) = G(r, r2), which proves (A). 
(B) Assume G(r;, r2) to be given in the form (2.8) and denote the right 


member of (2.7) by S(m, r). 


WF wd) SO At ir)x(5 6) 


S(2,7) = — 
r ai7 e\(ldr 
(d.n)=1 
(r) r 
= > ( .r) p 2 u(d) oo a(S ) 
r P e di7r) D r/d).e D 
(d.n)=1 


e 
[e) PF u(d). 


(d.n)=1 
d\(r/D).8 


| 

=" 

is 
M 
— 
Fie 
a | 
. 
ad 
M 

S 
= 


By Lemma 2.3, the innermost sum of the last expression is 0 unless y(r, 2) 

y(n,r), y(r/D) = y(r), and under these conditions it has the value 
u(y(r)/y(n, r)). Moreover, since f(m,r) is primitive (mod r), we must have 
then f(r/e, r) = f(y(r/e), r) = f(y(n,r), r) = f(n, r), and it therefore follows, 


with m = y(r)/y(n,r), that 


Sia, 7) = ¥(r)u(m)f(n, 1) > > pi(£) 


r e\lr Die 
y(r/e)=—7(n,7) y(r/D v(r 


Note that the conditions e|r, y(r/e) = y(n, r) are equivalent to the conditions, 
e\(r/y(n,r)), (r/e, y(r)) = y(n, 7r). Similarly, y(r/D) = y(r) and D\|(r/y(r)) 













are equivalent conditions for a divisor D of r. Therefore by definition of x(m,7r), 
one obtains 


_ (r)u(m)f(n, 1) (.. ) 
S(n, r) oa a.m ¥(r) ‘ . : 


r 


(8,m)=1 


Thus by Lemma 2.2, 





_ viel) f(r) rum) _ 4 
S(n,r) = : me = f(n,r). 
This completes the proof. 


As a consequence of Theorem 2.3, we have the following characterization 
of the class of primitive functions (mod r). 


THEOREM 2.4. A function $f(n, r)$ is primitive (mod $r$) if and only if it has a representation of the form (2.7). Moreover, the function $G(r_1, r_2)$ is uniquely determined, provided $r_1$ and $r_2$ are positive and $r_1$ is primitive.


Corresponding to Corollaries 2.1, 2.2, and 2.3 in the case of the completely even functions (mod $r$), we deduce from Theorem 2.3 the following analogous properties of the completely primitive functions (mod $r$).


COROLLARY 2.4. If $f(n, r) = k(m)$ is a completely primitive function (mod $r$), $m = \gamma(r)/\gamma(n, r)$, then


(2.9) fa,r) = > G@~@eG(n) = > A" r) (2), 


divyir) diri 
(d.n)=1 


where G(r;) is defined for primitive integers r,. 
Remark. The equivalence in (2.9) is to be interpreted in the same precise 
sense as Theorem 2.3. 


Proof. Place $r = r_1$, $r_2 = 1$ in (2.7) and note that $c(1, d) = \mu(d)$.
Formula (2.9) may be reformulated as 


dai\ri 

where r; is primitive and m is defined as in Corollary 2.4. Hence one obtains 

COROLLARY 2.5. If $r$ is primitive, then the inversion relation of Theorem 2.3 is equivalent to the Möbius inversion formula, provided $f(n, r)$ is restricted to the completely primitive functions (mod $r$).

COROLLARY 2.6 (cf. 5, Theorem 10). A function $f(n, r)$ is completely primitive (mod $r$) if and only if it is representable in the form (2.7) with $G(r_1, r_2) = G(r_1)$. The function $G(r_1)$ is uniquely determined for positive, primitive $r_1$.


3. The generalized Landau and Brauer-Rademacher identities. We first introduce some notation. Let $g(r)$ and $h(r)$ be functions of $r$ and define

(3.1) $f(n, r) = \sum_{d \mid (n, r)} h(d)\,g\left(\frac{r}{d}\right)\mu\left(\frac{r}{d}\right)$, $F(r) = f(0, r)$.


Definition. A function $f(r)$ is said to be completely multiplicative if $f(1) = 1$ and $f(r_1r_2) = f(r_1)f(r_2)$ for all $r_1$, $r_2$.


We now recall two simple lemmas proved in (5, § 4). 


LEMMA 3.1. If $h(r)$ is completely multiplicative, then

(3.2) $F(r) = h\left(\frac{r}{\gamma(r)}\right) F(\gamma(r))$.

LEMMA 3.2. If $g(r)$ is multiplicative, $h(r)$ is completely multiplicative, and for all primes $p$, $h(p) \ne 0$, $h(p) \ne g(p)$, then $F(r) \ne 0$ for all $r$.

We now prove a theorem which generalizes the Landau identity, (1.5).


THEOREM 3.1 (5, Theorem 9). If $g(r)$ and $h(r)$ satisfy the conditions of Lemma 3.2, then

(3.3) $\sum_{d \mid r,\ (d, n) = 1} \frac{g(d)}{F(d)}\,\mu^2(d) = \frac{h(r)F((n, r))}{F(r)h((n, r))}$.


Remark. Since $\mu(r)$, $g(r)$, and $h(r)$ are multiplicative, it follows that $F(r)$ is also multiplicative.


Proof. Denote the right member of (3.3) by $J(n, r)$; in view of the non-vanishing of $F(r)$ and $h(r)$, $J(n, r)$ is properly defined. We verify by Lemma 3.1, and the multiplicative property of $h(r)$ and $F(r)$, that

$J(n, r) = \frac{h(m)}{F(m)}$ $\left(m = \frac{\gamma(r)}{\gamma(n, r)}\right)$.

Hence $J(n, r)$ is completely primitive (mod $r$) and we may apply Corollary 2.4. In particular, we have

(3.4) $J(n, r) = \sum_{d \mid \gamma(r),\ (d, n) = 1} G(d)$,


where, assuming 7; primitive, 


: ri\ _ = h(d) (2). 
(3.5) G(r) = » A, rs) a(2) = » F(a) 
Hence, by the multiplicativity of u(r) and F(r), and by Lemma 3.2, 
G(r.) = 42. F n@ art :) 
" om” 


= FS hd ula) , hD)e6@)u66). 
Dé /d) 













The complete multiplicativity of A(r) gives, with Dd = E, 


oie) = FED Be mene aCe) Fat 


Biri d\z£ 


Hence by (2.1),

(3.6) $G(r_1) = \frac{\mu^2(r_1)g(r_1)}{F(r_1)}$.

By (3.4) and (3.6) the theorem is proved.


COROLLARY 3.1 ($n = 1$). Under the conditions of the Theorem,

(3.7) $\sum_{d \mid r} \frac{g(d)}{F(d)}\,\mu^2(d) = \frac{h(r)}{F(r)} = J(1, r)$.


Next we prove a generalization of the Brauer-Rademacher identity (1.4).
THEOREM 3.2. Under the conditions of Lemma 3.2,

(3.8) $F(r) \sum_{d \mid r,\ (d, n) = 1} \frac{h(d)}{F(d)}\,\mu\left(\frac{r}{d}\right) = \mu(r)f(n, r)$.


Proof. Denote the left member of (3.8) by $Q(n, r)$. Let $r_1$ and $r_2$ be the uniquely determined positive integers such that $r = r_1r_2$, $\gamma(r_2) = \gamma(n, r)$, $(r_1, r_2) = 1$. Then on the basis of Corollary 3.1 and the multiplicative property of $\mu(r)$,
= h(d) (:) 
Q(n,r) = F(r)u(r2) » Pid) “"\g@ 
= ri) 5 g(D)u(D) 
= F(r)pu(re) > (") > F(D) 


With d = DE, one obtains then 


. (D)u"(D) ID 
Q(n,r) = F(r)w(rs) 2 hot” Pn (ne), 


; F(D) E 
so that by (2.1) and the multiplicative property of u(r) and F(r), 
(3.9) Q(n,r) = F(re)u(r)u(ri)g(ri). 


By definition of F(r) and the multiplicativity of u(r), g(r), it follows that 


Q(n,r) = w(r)u(rs)g(rs) Do navel ®)u(22) = u(r) > nide( 2) (2) 


In view of the presence of the factor u(r) and the fact that y(r2) = y(n, r), 


one obtains then 


Q(n,r) = u(r) 7 nide(£) As) = p(r)f(n,r). 


d\(n.r) 


The theorem is proved. 






















4. The generalized Hölder identity and a second proof of the generalized Landau identity. In the proof of the generalized Landau identity (3.3), we used as starting point the right member $J(n, r)$. This was the natural approach, relative to the application of (3.3) in proving Theorem 3.2, because it was $J(n, r)$, with $n = 1$, that arose in the proof of that theorem. We shall also use (3.3) in the proof of the generalized Hölder theorem below. However, in this proof, it is the left member of (3.3) which arises; therefore, it is proper to give another proof of the generalized Landau identity, proceeding from the left side of (3.3).

Second proof of the generalized Landau identity, Theorem 3.1. Denote the left member of (3.3) by $S(n, r)$. We obtain then by the multiplicative property of $F(r)$ and $g(r)$, with $m = \gamma(r)/\gamma(e)$, $e = (n, r)$,


£2 wit) 
p>: oF Fie 2, s(d)F 
e m/d (=/2) 
F(m) 2d £ d) 2, HD) (4) D 
m/D 
For * D)e(™) wos (me ). 


Hence by (2.1) and multiplicativity,

(4.1) $S(n, r) = \frac{h(m)}{F(m)} = \frac{h(\gamma(r))F(\gamma(e))}{F(\gamma(r))h(\gamma(e))}$.

Multiplying both numerator and denominator of the last expression in (4.1) by $h(r/\gamma(r))h(e/\gamma(e))$, one obtains, by the complete multiplicativity of $h(r)$ and by (3.2),

$S(n, r) = \frac{h(r)F(e)}{F(r)h(e)}$ $(e = (n, r))$,

which is (3.3). The proof is complete.

We shall need the following lemma in the proof of the generalized Hölder identity.


LEMMA 4.1. Under the conditions of Lemma 3.2, if $a$ and $b$ are positive integers, then

(4.2) $F(ab) = \frac{F(a)F(b)h((a, b))}{F((a, b))}$.

Proof. In view of the multiplicative property of the functions concerned, it suffices to verify (4.2) in case $a = p^t$, $b = p^s$, $p$ prime, $t \ge s > 0$. Since $h(r)$ is completely multiplicative, it follows that for $q > 0$, $F(p^q) = h^{q-1}(p)(h(p) - g(p))$. Hence by Lemmas 3.1 and 3.2, one deduces, for the above values of $a$ and $b$,

$\frac{F(a)F(b)h((a, b))}{F((a, b))} = \frac{F(p^t)F(p^s)h(p^s)}{F(p^s)} = h^{t+s-1}(p)F(p) = F(p^{t+s}) = F(ab)$.

By multiplicativity, the lemma follows for arbitrary values of $a$ and $b$.
We now prove the following generalization of Hölder's identity (1.3).


THEOREM 4.1 (1, Theorem 2; 5, Theorem 2). If $g(r)$ and $h(r)$ satisfy the conditions of Lemma 3.2, then

(4.3) $f(n, r) = \frac{F(r)g(\delta)\mu(\delta)}{F(\delta)}$ $\left(\delta = \frac{r}{(n, r)}\right)$,

where $f(n, r)$ is defined by (3.1).
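A small numerical experiment may help fix the notation of (3.1), (3.3) and (4.3). The sketch below assumes the readings $f(n, r) = \sum_{d \mid (n, r)} h(d)g(r/d)\mu(r/d)$ and $F(r) = f(0, r)$, and uses the illustrative pair $h(d) = d^2$, $g(d) = d$, which is chosen here only because it satisfies the conditions of Lemma 3.2; none of the function names come from the paper.

```python
from fractions import Fraction
from math import gcd


def divisors(r):
    return [d for d in range(1, r + 1) if r % d == 0]


def mobius(r):
    result, p = 1, 2
    while p * p <= r:
        if r % p == 0:
            r //= p
            if r % p == 0:
                return 0  # square factor
            result = -result
        p += 1
    return -result if r > 1 else result


def h(d):
    return d * d  # completely multiplicative, h(p) != 0 (illustrative)


def g(d):
    return d  # multiplicative, g(p) != h(p) (illustrative)


def f(n, r):
    # Definition (3.1); note gcd(0, r) == r, so F(r) = f(0, r).
    return sum(h(d) * g(r // d) * mobius(r // d)
               for d in divisors(gcd(n, r)))


def F(r):
    return f(0, r)


def generalized_holder_holds(n, r):
    delta = r // gcd(n, r)
    return f(n, r) == Fraction(F(r) * g(delta) * mobius(delta), F(delta))


def generalized_landau_holds(n, r):
    e = gcd(n, r)
    lhs = sum(Fraction(g(d) * mobius(d) ** 2, F(d))
              for d in divisors(r) if gcd(d, n) == 1)
    return lhs == Fraction(h(r) * F(e), F(r) * h(e))


print(all(generalized_holder_holds(n, r) and generalized_landau_holds(n, r)
          for r in range(1, 41) for n in range(1, r + 1)))
```

Taking $h(d) = d$ and $g(d) = 1$ instead gives $F = \phi$ and reduces both checks to the classical identities (1.3) and (1.5).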


Proof. Denote the right member of (4.3) by $T(n, r)$. Evidently $T(n, r)$ is even (mod $r$). Hence by Theorem 2.1, $T(n, r)$ has the representation,

(4.4) $T(n, r) = \sum_{d \mid (n, r)} H\left(d, \frac{r}{d}\right)$,

where, with $r = r_1r_2$,

$H(r_1, r_2) = \sum_{d \mid r_1} T\left(\frac{r_1}{d}, r\right)\mu(d) = F(r) \sum_{d \mid r_1} \frac{g(r_2d)\mu(r_2d)\mu(d)}{F(r_2d)}$.

But by definition of $\mu(r)$ and multiplicativity, it follows that

$H(r_1, r_2) = \frac{F(r)g(r_2)\mu(r_2)}{F(r_2)} \sum_{d \mid r_1,\ (d, r_2) = 1} \frac{g(d)}{F(d)}\,\mu^2(d)$.

Applying Theorem 3.1 one obtains

$H(r_1, r_2) = \frac{F(r)g(r_2)\mu(r_2)}{F(r_2)} \cdot \frac{h(r_1)F((r_1, r_2))}{F(r_1)h((r_1, r_2))}$,

so that by Lemma 4.1,

(4.5) $H(r_1, r_2) = h(r_1)g(r_2)\mu(r_2)$.

The theorem follows from (4.4) and (4.5) and the definition of $f(n, r)$.
Combination of (3.8) and (4.3) yields the following result.


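COROLLARY 4.1. Under the conditions of the Theorem,

(4.6) $\sum_{d \mid r,\ (d, n) = 1} \frac{h(d)}{F(d)}\,\mu\left(\frac{r}{d}\right) = \frac{\mu(r)\mu(\delta)g(\delta)}{F(\delta)}$ $\left(\delta = \frac{r}{(n, r)}\right)$.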








5. A special case. In this section we illustrate the results of §§ 3 and 4 with a particular example. Let $J(r) = \phi_2(r)$ denote the Jordan totient of rank 2. We recall the following identity proved in (5, Corollary 24; $t = n = 1$):

(5.1) $\sum_{d \mid r} \frac{\mu(d)\phi(d)}{J(d)} = \frac{r\phi(r)}{J(r)}$.

Placing $h(r) = 1$ and $g(r) = \phi(r)/J(r)$ in (3.3), (3.8), (4.3), and (4.6), respectively, one obtains on the basis of (5.1) the following relations.

(5.2) $\sum_{d \mid r,\ (d, n) = 1} \frac{\mu^2(d)}{d} = \frac{J(r)}{r\phi(r)} \left(\frac{(n, r)\,\phi((n, r))}{J((n, r))}\right)$,

(5.3) $\frac{r\phi(r)}{J(r)} \sum_{de = r,\ (d, n) = 1} \frac{\mu(e)J(d)}{d\,\phi(d)} = \mu(r) \sum_{de = r,\ d \mid n} \frac{\mu(e)\phi(e)}{J(e)}$,

(5.4) $\sum_{de = r,\ d \mid n} \frac{\mu(e)\phi(e)}{J(e)} = \frac{\phi(r)(n, r)}{J(r)}\,\mu\left(\frac{r}{(n, r)}\right)$,

(5.5) $\sum_{de = r,\ (d, n) = 1} \frac{\mu(e)J(d)}{d\,\phi(d)} = \frac{\mu(r)(n, r)}{r}\,\mu\left(\frac{r}{(n, r)}\right)$.


REFERENCES 


1. Douglas R. Anderson and T. M. Apostol, The evaluation of Ramanujan's sum and its generalizations, Duke Math. J., 20 (1953), 211-216.
2. A. Brauer and H. Rademacher, Aufgabe 31, Jahresbericht der deutschen Mathematiker-Vereinigung, 35 (1926), 94-95 (supplement).
3. Eckford Cohen, A class of arithmetical functions, Proc. Nat. Acad. Sci., 41 (1955), 939-944.
4. Eckford Cohen, Some totient functions, Duke Math. J., 23 (1956), 515-523.
5. Eckford Cohen, Representations of even functions (mod r), I. Arithmetical identities, Duke Math. J., 25 (1958), 401-421.
6. Eckford Cohen, Representations of even functions (mod r), II. Cauchy products, Duke Math. J., 26 (1959), 165-182.
7. O. Hölder, Zur Theorie der Kreisteilungsgleichung, Prace Matematyczno-Fizyczne, 43 (1936), 13-23.
8. Edmund Landau, Ueber die zahlentheoretische Funktion φ(n) und ihre Beziehung zum Goldbachschen Satz, Göttinger Nachrichten, (1900), 177-186.


University of Tennessee 








THE NUMBER OF k-COLOURED GRAPHS ON 
LABELLED NODES 


R. C. READ 


Introduction. By a labelled graph we shall mean a set of "nodes," distinguishable from one another and denoted by $A_1, A_2, \ldots$, and a collection of "edges," viz., pairs of nodes. We say that an edge "joins" the pair of nodes which specifies it. We further stipulate that at most one edge joins any two nodes, and that no edge joins a node to itself.

By a "colouring" of a graph in $k$ colours we shall mean a mapping of the nodes of the graph onto a set of $k$ colours $C_1, C_2, \ldots, C_k$ such that no two nodes which are joined by an edge are mapped onto the same colour. A graph so coloured in exactly $k$ colours will be called a $k$-coloured graph. Since it is usually possible to colour a graph in more than one way, there will, in general, be many $k$-coloured graphs corresponding to a given graph.

The object of this paper is to derive an expression for the number of labelled $k$-coloured graphs on a given number of nodes. This is a generalization of a result given by Gilbert (2, § 1). Suppose we are given a set of $n$ nodes $A_1, A_2, \ldots, A_n$, a set of positive non-zero integers $n_1, n_2, \ldots, n_k$ such that $n_1 + n_2 + \ldots + n_k = n$ and a set of integers $e_{\alpha\beta}$ ($\alpha, \beta = 1, 2, \ldots, k$). We shall count the number of $k$-coloured graphs on these $n$ nodes which are such that $n_\alpha$ nodes are allocated the colour $C_\alpha$ and $e_{\alpha\beta}$ edges join nodes allocated the colour $C_\alpha$ to nodes allocated the colour $C_\beta$ ($\alpha, \beta = 1, 2, \ldots, k$). We let $E = \sum_{\alpha<\beta} e_{\alpha\beta}$ be the total number of edges.

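First allocate the colours to the various nodes. This is possible in

$\dfrac{n!}{n_1!\,n_2! \cdots n_k!}$

different ways. Next consider the number of ways of choosing $e_{\alpha\beta}$ edges joining nodes coloured in $C_\alpha$ and $C_\beta$. There are $n_\alpha n_\beta$ possible edges, so the choice can be made in

$\dbinom{n_\alpha n_\beta}{e_{\alpha\beta}}$

different ways. Thus the total number of graphs is

(1) $\dfrac{n!}{n_1!\,n_2! \cdots n_k!} \prod_{\alpha<\beta} \dbinom{n_\alpha n_\beta}{e_{\alpha\beta}}$.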





Received April 4, 1959. 










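To find the number of graphs having $E$ edges we must sum expression (1) over all sets $\{e_{\alpha\beta}\}$ such that $\sum e_{\alpha\beta} = E$. Since

$\dbinom{n_\alpha n_\beta}{e_{\alpha\beta}}$

is the coefficient of $t^{e_{\alpha\beta}}$ in $(1 + t)^{n_\alpha n_\beta}$, this sum is the coefficient of $t^E$ in

$\dfrac{n!}{n_1!\,n_2! \cdots n_k!} \prod_{\alpha<\beta} (1 + t)^{n_\alpha n_\beta} = \dfrac{n!}{n_1!\,n_2! \cdots n_k!}\,(1 + t)^{\sum_{\alpha<\beta} n_\alpha n_\beta}$

and is therefore

(2) $\dfrac{n!}{n_1!\,n_2! \cdots n_k!} \dbinom{\frac{1}{2}(n^2 - \sum n_\alpha^2)}{E}$

since

$\sum_{\alpha<\beta} n_\alpha n_\beta = \tfrac{1}{2}\left(\sum n_\alpha\right)^2 - \tfrac{1}{2}\sum n_\alpha^2$.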


In the special case when there are $n$ colours and each node receives a different colour we have $n_1 = n_2 = \ldots = n_n = 1$ and (2) reduces to

$n! \dbinom{\frac{1}{2}n(n - 1)}{E}$.

Since, under these conditions, every graph on $n$ nodes can be coloured in $n!$ different ways, the number of graphs on $n$ labelled nodes having $E$ edges is seen to be

$\dbinom{\frac{1}{2}n(n - 1)}{E}$.

This result is easily obtained directly (2, p. 405).
To find the total number of $k$-coloured graphs on $n$ nodes and $E$ edges we need to sum (2) over all sets $\{n_\alpha\}$ such that $\sum n_\alpha = n$. Thus we obtain

$\sum_{(n)} \dfrac{n!}{n_1!\,n_2! \cdots n_k!} \dbinom{\frac{1}{2}(n^2 - \sum n_\alpha^2)}{E}$

but it does not appear that this formula is very amenable to manipulation.
Let us now remove the restriction on the number of edges in the graph, and consider the number of $k$-coloured graphs (whatever the number of edges) which are associated in the above way with the set $\{n_\alpha\}$. This number is obtained by summing (2) for all possible values of $E$, and is thus

(3) $\dfrac{n!}{n_1!\,n_2! \cdots n_k!}\, 2^{\frac{1}{2}n^2 - \frac{1}{2}\sum n_\alpha^2}$.








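The total number of $k$-coloured graphs on $n$ labelled nodes can now be found. We denote it by $F_n(k)$, and we see that

$F_n(k) = \sum_{(n)} \dfrac{n!}{n_1!\,n_2! \cdots n_k!}\, 2^{\frac{1}{2}n^2 - \frac{1}{2}\sum n_\alpha^2}$

$= n!\,2^{\frac{1}{2}n^2}$ times the coefficient of $x^n$ in

$\left\{ \sum_{s=1}^{\infty} \dfrac{x^s}{s!\,2^{\frac{1}{2}s^2}} \right\}^k$.

Hence

(4) $\sum_{n=1}^{\infty} F_n(k)\,\dfrac{x^n}{n!\,2^{\frac{1}{2}n^2}} = \left\{ \sum_{s=1}^{\infty} \dfrac{x^s}{s!\,2^{\frac{1}{2}s^2}} \right\}^k$

from which $F_n(k)$ may be calculated.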





2. For convenient calculation of $F_n(k)$ we may write (4) as

$\sum_{n} F_n(k)\,\dfrac{x^n}{n!\,2^{\frac{1}{2}n^2}} = \left( \sum_{r} F_r(k - 1)\,\dfrac{x^r}{r!\,2^{\frac{1}{2}r^2}} \right) \left( \sum_{s} \dfrac{x^s}{s!\,2^{\frac{1}{2}s^2}} \right)$

whence, equating coefficients of $x^n$, we obtain

(5) $F_n(k) = \sum_{r=1}^{n-1} \dbinom{n}{r}\, 2^{r(n-r)} F_r(k - 1)$

which gives the numbers of $k$-coloured graphs in terms of the numbers of $(k - 1)$-coloured graphs. Some values of $F_n(k)$ are given in Table I.








TABLE I

                                      k
n    1       2          3           4            5            6            7

1    1       0          0           0            0            0            0
2    1       4          0           0            0            0            0
3    1      24         48           0            0            0            0
4    1     160       1152        1536            0            0            0
5    1    1440      30720      122880       122880            0            0
6    1   18304    1152000    10813440     29491200     23592960            0
7    1  330624   65630208  1348730880   7707033600  15854469120  10569646080
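The recurrence (5) is easy to run. The sketch below is one possible transcription, assuming the starting values $F_n(1) = 1$ for $n \ge 1$ (a single colour admits only the graph with no edges); the function name `F` simply mirrors the notation of the text.

```python
from functools import lru_cache
from math import comb


@lru_cache(maxsize=None)
def F(n, k):
    """Number of k-coloured graphs on n labelled nodes, via recurrence (5)."""
    if k == 1:
        # One colour: no edges are allowed, so exactly one graph (n >= 1).
        return 1 if n >= 1 else 0
    return sum(comb(n, r) * 2 ** (r * (n - r)) * F(r, k - 1)
               for r in range(1, n))


# Print the rows of the table for n = 1, ..., 7 and k = 1, ..., 7.
for n in range(1, 8):
    print(n, [F(n, k) for k in range(1, 8)])
```

For $k = n$ every node has its own colour, so $F_n(n) = n!\,2^{n(n-1)/2}$, which gives an independent check on the diagonal of the table.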


3. If we wish to count the total number of graphs coloured in $k$ or fewer colours, we proceed as before but remove the restriction that the $n_\alpha$'s are non-zero, and allow them to be any non-negative integers. Denoting the required number of graphs by $M_n(k)$ we obtain

(6) $\sum_{n=0}^{\infty} M_n(k)\,\dfrac{x^n}{n!\,2^{\frac{1}{2}n^2}} = \left\{ \sum_{s=0}^{\infty} \dfrac{x^s}{s!\,2^{\frac{1}{2}s^2}} \right\}^k$.







By the method of § 2 we obtain from (6) the relation

(7) $M_n(k) = \sum_{r=0}^{n} \dbinom{n}{r}\, 2^{r(n-r)} M_r(k - 1)$

with $M_0(k) = 1$.

$M_n(k)$, unlike $F_n(k)$, is a polynomial in $k$ of degree $n$. This follows either from (7) by mathematical induction, or from the fact that $M_n(k)$ is the sum of the chromatic polynomials* of all graphs on $n$ nodes, each polynomial being counted as many times as there are ways of labelling the corresponding graph. Some values of $M_n(k)$ are given in Table II.


TABLE II

                                          k
n    1       2         3          4         5         6         7        8        9

1    1       2         3          4         5         6         7        8        9
2    1       6        15         28        45        66        91      120      153
3    1      26       123        340       725      1326      2191     3368     4905
4    1     162      1635       7108     20805     48486     97447   176520   296073
5    1    1442     35043     254404   1058885   3216486   7986727
6    1   18306   1206915   15531268
7    1  330626  66622083

The first four polynomials are

$M_1(k) = k$,
$M_2(k) = 2k^2 - k$,
$M_3(k) = 8k^3 - 12k^2 + 5k$,

and

$M_4(k) = 64k^4 - 192k^3 + 208k^2 - 79k$.
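The recurrence (7) can be transcribed in the same way. The sketch assumes the starting values $M_0(k) = 1$ for all $k \ge 0$ and $M_n(0) = 0$ for $n \ge 1$ (no colours, no colourings), which follow from (6) with $k = 0$; the name `M` mirrors the text.

```python
from functools import lru_cache
from math import comb


@lru_cache(maxsize=None)
def M(n, k):
    """Graphs on n labelled nodes coloured in k or fewer colours, via (7)."""
    if n == 0:
        return 1  # the empty graph
    if k == 0:
        return 0  # no colours available for a non-empty graph
    return sum(comb(n, r) * 2 ** (r * (n - r)) * M(r, k - 1)
               for r in range(n + 1))


def M4_polynomial(k):
    return 64 * k ** 4 - 192 * k ** 3 + 208 * k ** 2 - 79 * k


print([M(4, k) for k in range(1, 10)])
print([M4_polynomial(k) for k in range(1, 10)])
```

Since a polynomial of degree $n$ is determined by $n + 1$ values, agreement at $k = 0, 1, \ldots, n$ already confirms each of the stated polynomials against the recurrence.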


4. If $f_n(k)$ denotes the number of connected $k$-coloured graphs, it can be shown by the methods used in (2) that

(8) $\sum_{n=0}^{\infty} F_n(k)\,\dfrac{x^n}{n!} = \exp\left\{ \sum_{n=1}^{\infty} f_n(k)\,\dfrac{x^n}{n!} \right\}$

with $F_0(k) = 1$.

Differentiating both sides of (8) and equating coefficients of $x^{n-1}$ we obtain


*For the definition of the chromatic polynomial of a graph see (1). 











(9) $f_n(k) = F_n(k) - \sum_{r=1}^{n-1} \dbinom{n-1}{r-1} F_{n-r}(k) f_r(k)$

giving $f_n(k)$ in terms of $f_{n-1}(k)$, $f_{n-2}(k)$, ..., when $F_n(k)$, $F_{n-1}(k)$, ..., are known.


REFERENCES 
1. H. Whitney, A logical expansion in mathematics, Bull. Amer. Math. Soc., 34 (1932), 339-362.
2. E. N. Gilbert, Enumeration of labelled graphs, Can. J. Math., 8 (1956), 405-411.


University College of the West Indies
















DISCRETE GROUPS OF MOTIONS 
LEON GREENBERG 


1. Introduction. This paper deals with the discrete groups of rigid 
motions of the hyperbolic plane. It is known (12) that the finitely generated, 
orientation-preserving groups have the following presentations: 


Generators: ae ee ee 


Defining relations: eso « Mgdt a < cahens « «Gp @ i, 
Sl=| SY=...= Sf= 1, 


where $R_m = a_mb_ma_m^{-1}b_m^{-1}$. We shall denote this group by F(p; m,... , mg; 1).

In particular, the finitely generated free groups are contained among these. 
Indeed, one purpose of this paper is to indicate some geometrical methods 
for investigating free groups. 

The above groups also include the orientation-preserving discrete groups of 
motions of the sphere and Euclidean plane (3). But the results we shall 
obtain are mainly concerned with the hyperbolic groups and are either easy 
or false for the Euclidean and spherical groups. For instance, we shall extend 
the following theorem of Howson (7) to discrete groups of motions: if S and 
$T$ are finitely generated subgroups of a free group, then $S \cap T$ is also finitely
generated. This theorem is trivial for the Euclidean and spherical groups, 
which contain no infinitely generated subgroups. We shall generalize the 
theorem of Karrass and Solitar (8) that if F is a free group and H is a finitely 
generated subgroup which contains a normal subgroup of F, then H is of 
finite index. This theorem is trivial for the spherical groups and is false for 
most of the Euclidean groups. For the above reasons we shall consider only 
the case of discrete hyperbolic groups. We shall usually omit “discrete hyper- 
bolic." We mention the interesting result of Nielsen (11), Bundgaard (1), 
and Fox (5) that the above groups all contain subgroups of finite index with 
no elements of finite order. 


2. Hyperbolic groups. Let $D$ be the disk $\{z : |z| < 1\}$ in the complex plane, $\bar{D}$ its closure and $E$ its boundary. $D$ can be given a Riemannian metric so that it becomes the Poincaré model of the hyperbolic plane. The geodesics, which we shall call h-lines, are arcs of circles orthogonal to $E$. The isometries are the linear fractional transformations which preserve $D$. They are of the forms





Received June 29, 1959. This is in part taken from a doctoral dissertation presented to Yale 
University. 















where $a\bar{a} - b\bar{b} = c\bar{c} - d\bar{d} = 1$.

The transformation $S$ is called a translation if it has two fixed points which are on $E$. This is equivalent to the condition $|a + \bar{a}| > 2$. A translation maps each circle through the fixed points onto itself. In particular, the h-line through the fixed points is invariant and is called the axis of the transformation. $S$ is called a rotation if it has a fixed point in $D$, and a limit-rotation if it has a single fixed point on $E$. These conditions are respectively equivalent to $|a + \bar{a}| < 2$ and $|a + \bar{a}| = 2$. $T$ is a reflection in an h-line if $d + \bar{d} = 0$. Otherwise $T$ is a glide-reflection, that is, the product of a translation along an axis $\lambda$ with a reflection in $\lambda$.

For each transformation $S$ or $T$, there is a pair of h-lines $\lambda$ and $\lambda'$, called the isometric circles of the transformation (see (4)). $S$ is the product of a reflection in $\lambda$ and a reflection in the perpendicular bisector of the Euclidean line through the centres of the circles $\lambda$ and $\lambda'$. If $T$ is a glide-reflection, this product must be combined with a reflection in the h-line through the fixed points of $T$; if $T$ is a reflection, $\lambda = \lambda'$ is the h-line of reflection. $S$ is a translation, rotation or limit-rotation, according as $\lambda$ and $\lambda'$ do not intersect, do intersect, or are tangent. A discrete, hyperbolic group $G$ has a canonical fundamental region, denoted $R_G$, which consists of the region in $D$ outside of the isometric circles of all elements of $G$.

A subset $M$ of $\bar D$ is called h-convex if with every two of its points it contains
the h-line segment between them. For any subset $M$ of $\bar D$, we denote by $[M]$
the h-convex closure of $M$, that is, the intersection of all h-convex subsets
of $\bar D$ which contain $M$.

The set of limit points $L_G$ of a group $G$ is the intersection with $E$ of the
set of limit points of $\{g(z) : g \in G\}$, where $z$ is any point in $D$. This set is
independent of $z \in D$, because the transformations in $G$ preserve hyperbolic
distances, and these become arbitrarily small relative to Euclidean distance,
as $E$ is approached. $L_G$ is a closed set, invariant under $G$. The convex figure
of $G$ is the set $K_G = [L_G] \cap D$. This is an h-convex set which is invariant
under $G$.

For each limit-rotation $g \in G$, it is possible to find a limit-circle $C_g$ so
that:

(a) $C_g$ is tangent to $E$ at the fixed point of $g$,

(b) $C_g \subset K_G$,

(c) If $g_1$ and $g_2$ are limit-rotations such that $g_2 = f g_1 f^{-1}$, where $f \in G$, then
$C_{g_2} = f C_{g_1}$,

(d) If $z_1$ and $z_2$ are two points interior to $C_g$, $u$ is the fixed point of $g$, and
$z_2 = f(z_1)$, where $f \in G$, then $f$ is either a limit-rotation with fixed point
$u$, or $f$ is a reflection in an h-line with one endpoint at $u$.


DISCRETE GROUPS OF MOTIONS 417 


We shall denote by $K^*_G$ the region obtained from $K_G$ by deleting the interior
of each $C_g$. $K^*_G$ is neither unique nor h-convex, but it is invariant under $G$.
We shall say that $K^*_G$ is compact mod $G$, if there exists a disk
$\Gamma = \{z : |z| < r < 1\}$ such that
$$K^*_G \subset G\Gamma = \bigcup_{g \in G} g\Gamma.$$

This is equivalent to the compactness of the surface obtained from $K^*_G$ by
identifying points congruent under $G$. Nielsen (10; 12) has proved that $G$ is
finitely generated, if and only if $K^*_G$ is compact mod $G$.


It is not hard to see (by constructing the fundamental region) that every
hyperbolic group $F(p; n_1, \ldots, n_d; r)$ is realized as a group without
limit-rotations. In fact, according to Nielsen (12), for any finitely generated,
hyperbolic group $G$, there is a homeomorphism $s$ of $D$, such that $sGs^{-1}$ is a group of
motions without limit-rotations. When $G$ contains no limit-rotations,
$$K^*_G = K_G.$$


3. The results. Coxeter (2) and Goldberg (6) have shown that every 
abelian subgroup of the modular group F(0; 2,3; 1) is cyclic. We shall prove 
the following stronger version of this for the discrete, orientation-preserving, 
hyperbolic groups. 


THEOREM 1. If $F(p; n_1, n_2, \ldots, n_d; r)$ is hyperbolic, then the centralizer of any
element is cyclic. The possible finite orders are the divisors of $n_1, n_2, \ldots, n_d$. Any
finite subgroup is a cyclic group, conjugate to a subgroup of $(S_1), (S_2), \ldots$, or $(S_d)$.


Proof. It is well-known that two orientation-preserving linear fractional 
transformations commute if and only if they have the same fixed points. 
Therefore the centralizer of a rotation or limit-rotation is a group of rotations 
or limit-rotations with the same fixed point, and the centralizer of a trans- 
lation is a group of translations with the same invariant axis. Each of these 
groups leaves a curve (or curves) invariant: a circle in $D$, for a group of
rotations, a limit-circle for a group of limit-rotations, an h-line for a group
of translations. Because the group is discrete, there must be an element which 
transforms a given point (on the invariant curve) the least distance in a 
fixed direction. This element generates the group, which is therefore cyclic. 


The group $F(p; n_1, n_2, \ldots, n_d; r)$ has a fundamental region, which has
among its vertices the points $z_1, z_2, \ldots, z_d$, which are fixed points for
$S_1, S_2, \ldots, S_d$ respectively. If $z$ is a fixed point of a rotation $S$, there is an
element $f$ which maps $z$ into one of the points $z_k$. Then $fSf^{-1}$ is in the subgroup
generated by $S_k$. Therefore the order of $fSf^{-1}$, which is the same as the order
of $S$, divides $n_k$.

Let $G$ be any finite subgroup of $F(p; n_1, \ldots, n_d; r)$, and let $T_1$ and $T_2$ be
two elements (necessarily rotations) with fixed points $t_1$ and $t_2$ in $D$. We shall
show that $t_1 = t_2$. Assuming otherwise, let $\lambda_3$ be the h-line through $t_1$ and $t_2$,
and $r_3$ the reflection in $\lambda_3$. There are h-lines $\lambda_2$ and $\lambda_1$ through the points $t_1$
and $t_2$ respectively, such that if $r_i$ is the reflection in $\lambda_i$, then $T_1 = r_2 r_3$ and
$T_2 = r_3 r_1$. Therefore $T_1 T_2 = r_2 r_1$. If $\lambda_1$ and $\lambda_2$ diverge, $r_2 r_1$ is a translation;
if $\lambda_1$ and $\lambda_2$ are asymptotic (meet at a point on $E$), then $r_2 r_1$ is a limit-rotation.
Since these are transformations of infinite order, it follows that $\lambda_1$ and $\lambda_2$
must meet at a point $t_3$ in $D$, and $T_3 = r_1 r_2$ is a rotation whose fixed point
is $t_3$. The group $(r_1, r_2, r_3)$ has the triangle $t_1 t_2 t_3$ as fundamental region. This
group is infinite, since the images of $t_1 t_2 t_3$ under $(r_1, r_2, r_3)$ cover $D$; since
$(T_1, T_2, T_3)$ is of index 2 in $(r_1, r_2, r_3)$, the former subgroup is also infinite. We
conclude that $t_1 = t_2$, and $G$ is a cyclic group conjugate to a subgroup of
$(S_1), (S_2), \ldots$, or $(S_d)$.

For hyperbolic groups which contain orientation-reversing transformations,
the only exceptions are the following. The centralizer of a translation or
glide-reflection can be a product of cyclic groups $C_\infty \times C_2$. The centralizer
of a reflection can be the group $C_\infty \times C_2$ or $F(0; 2, 2; 1) \times C_2$. A finite
subgroup can be a dihedral group.


THEOREM 2. If $S$ and $T$ are finitely generated subgroups of a discrete group,
then $S \cap T$ is also finitely generated.


Proof. Let $H = S \cap T$ and let $G$ be the finitely generated discrete group
generated by $S$ and $T$. As we remarked in § 2, we can suppose that $G$ contains
no limit-rotations. By Nielsen’s theorem, there exist disks
$$\Gamma_S = \{z : |z| < r_S < 1\}, \qquad \Gamma_T = \{z : |z| < r_T < 1\}$$
such that $K_S \subset S\Gamma_S$ and $K_T \subset T\Gamma_T$. Let $r = \max(r_S, r_T)$ and $\Gamma = \{z : |z| < r\}$.
Then
$$K_S \subset S\Gamma \quad\text{and}\quad K_T \subset T\Gamma.$$
Choose coset representatives $\{s_i\}$, $\{t_j\}$ so that
$$S = \bigcup_i H s_i \quad\text{and}\quad T = \bigcup_j H t_j.$$
Then
$$K_S \subset S\Gamma = H \bigcup_i s_i\Gamma, \qquad K_T \subset T\Gamma = H \bigcup_j t_j\Gamma.$$
Also
$$K_H \subset K_S \cap K_T,$$
since
$$L_H \subset L_S \cap L_T.$$

We now show that $s_i\Gamma \cap K_T \neq \emptyset$ for only a finite number of representatives
$s_i$. For any $h \in H$, $s_i\Gamma \cap h t_j\Gamma \neq \emptyset$ if and only if $\Gamma \cap s_i^{-1} h t_j\Gamma \neq \emptyset$. Now if
$d(z_1, z_2)$ is the hyperbolic distance between the points $z_1$ and $z_2$ in $D$, and the
hyperbolic radius of $\Gamma$ is $\rho$, then $\Gamma \cap g\Gamma \neq \emptyset$ if and only if $d(0, g(0)) < 2\rho$.
But the discreteness of $G$ implies that there are only a finite number of
elements $g \in G$ with this last property. Therefore there are only a finite number
of elements $g = s_i^{-1} h t_j$ with $\Gamma \cap g\Gamma \neq \emptyset$. Note that if
$$s_{i_1}^{-1} h_1 t_{j_1} = s_{i_2}^{-1} h_2 t_{j_2},$$
then
$$s_{i_2} s_{i_1}^{-1} h_1 = h_2 t_{j_2} t_{j_1}^{-1} \in S \cap T = H.$$
Therefore $s_{i_2} s_{i_1}^{-1}$ and $t_{j_2} t_{j_1}^{-1} \in H$, so
$$s_{i_1} = s_{i_2}, \qquad t_{j_1} = t_{j_2},$$
and $h_1 = h_2$. It follows that there are only a finite number of the $s_i$, $t_j$, $h$ for
which $s_i\Gamma \cap h t_j\Gamma \neq \emptyset$, and therefore only a finite number of the $s_i$ for which
$s_i\Gamma \cap K_T \neq \emptyset$.

Since $K_H \subset K_T$, there are only a finite number of the $s_i$, say $s_{i_1}, s_{i_2}, \ldots, s_{i_n}$,
so that $s_i\Gamma \cap K_H \neq \emptyset$. Furthermore, the elements of $H$ map $K_H$ and $K_S$
onto themselves and consequently $K_S - K_H$ onto itself. It follows that
$s_i\Gamma \cap K_H \neq \emptyset$ if and only if $H s_i\Gamma \cap K_H \neq \emptyset$. Recalling that
$$K_H \subset H \bigcup_i s_i\Gamma,$$
we now obtain
$$K_H \subset H \bigcup_{k=1}^{n} s_{i_k}\Gamma.$$

Let $\Gamma'$ be a disk with centre 0 and radius $r' < 1$, which is large enough to
contain
$$\bigcup_{k=1}^{n} s_{i_k}\Gamma.$$

Then $K_H \subset H\Gamma'$, or $K_H$ is compact mod $H$. Nielsen's theorem now implies
that $H$ is finitely generated.
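
The criterion used in the proof, that $\Gamma \cap g\Gamma \neq \emptyset$ exactly when $d(0, g(0)) < 2\rho$, can be illustrated numerically. The sketch below is only an illustration under stated assumptions: the metric is normalized to curvature $-1$, so that $d(z, w) = 2\,\mathrm{artanh}\,|z - w|/|1 - \bar z w|$, and $g$ is taken in the assumed normal form $g(z) = (az + \bar b)/(bz + \bar a)$. The function names are hypothetical.

```python
import math

def hyp_dist(z, w):
    """Hyperbolic distance in the unit disk (curvature -1 normalization, an assumption)."""
    return 2 * math.atanh(abs(z - w) / abs(1 - z.conjugate() * w))

def hyp_radius(r):
    """Hyperbolic radius of the Euclidean disk |z| < r centred at 0."""
    return 2 * math.atanh(r)

def apply_g(a, b, z):
    """g(z) = (a z + conj(b)) / (b z + conj(a)), assumed normal form with |a|^2 - |b|^2 = 1."""
    return (a * z + b.conjugate()) / (b * z + a.conjugate())

def disks_meet(a, b, r):
    """Gamma and g(Gamma) are open hyperbolic disks of radius rho about 0 and g(0);
    they intersect exactly when the centres are closer than 2 * rho."""
    return hyp_dist(0j, apply_g(a, b, 0j)) < 2 * hyp_radius(r)
```

With $a = \cosh(t/2)$ and $b = \sinh(t/2)$ the map $g$ is a translation moving the origin a hyperbolic distance $t$, so `disks_meet` reduces to the comparison $t < 2\rho$.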


THEOREM 3. If $H$ is a finitely generated subgroup of $G$ and if $L_H = L_G$, then
$[G : H]$ is finite.


Proof. If $L_G = \emptyset$, then $G$ must be finite. If $L_G$ consists of a single point $z$,
then the elements of $G$ and $H$ are limit-rotations whose fixed point is $z$, and
possibly reflections in h-lines with one endpoint at $z$. It is easy to see that
the index $[G : H]$ is finite in this case. If $L_G$ contains more than one point,
then $K^*_G$ and $K^*_H$ are non-empty sets. By Nielsen’s theorem there is a disk
$\Gamma = \{z : |z| < r < 1\}$ so that $K^*_H \subset H\Gamma$. Since $G$ is discrete, there can be
only a finite number of elements $g \in G$ so that $\Gamma \cap g\Gamma \neq \emptyset$. We shall show
that every $g \in G$ is congruent mod $H$ to one of these elements, which we
denote by $g_1, g_2, \ldots, g_n$. Let $z \in \Gamma \cap K^*_G$ (we suppose that $\Gamma$ is large enough
so that this intersection is not empty) and let $g \in G$. Since $K_G = K_H$, we
have $K^*_G \subset K^*_H$. $K^*_G$ is invariant under $G$, so $g(z) \in K^*_H$. Therefore there
exists $h \in H$ so that $hg(z) \in \Gamma$. Thus $\Gamma \cap hg\Gamma \neq \emptyset$ (since $hg(z) \in \Gamma \cap hg\Gamma$)
and $hg = g_k$ for some $k$. It follows that $[G : H]$ is finite.

The following is proved in (4, p. 43).


LEMMA 1. If $S$ is a closed subset of $E$ which contains more than one point, and
$S$ is invariant under a group $G$, then $S \supset L_G$.


Definition. An N-chain of a group $G$ is a sequence of subgroups $G_1, G_2, \ldots, G_n$
such that:
(a) $G_k \neq \{1\}$ $(k = 1, 2, \ldots, n)$,

(b) either $G_k$ is a normal subgroup of $G_{k+1}$, or $G_{k+1}$ is a normal subgroup
of $G_k$.


We shall say that two subgroups $H$ and $K$ are N-equivalent if there is an
N-chain $H = G_1, G_2, \ldots, G_n = K$. A subgroup which is N-equivalent to $G$
will be called an N-subgroup.

We shall call a group quasi-abelian if it leaves invariant an h-line or a
point in $D$. Such a group is either abelian or has an abelian subgroup of
index 2. $G$ is quasi-abelian if and only if $L_G$ consists of 0, 1, or 2 points. The
following Lemma shows that an N-equivalence class consists entirely of
quasi-abelian groups if it contains one such group.





LEMMA 2. If $G$ and $H$ are N-equivalent subgroups of a discrete group and $G$
is not quasi-abelian, then $L_G = L_H$.


Proof. Let the N-chain be $G = G_1, G_2, \ldots, G_n = H$. We proceed to prove
by induction that
$$L_{G_k} = L_G \qquad (k = 1, \ldots, n).$$

Clearly $L_{G_1} = L_G$. Assume $L_{G_k} = L_G$. If $G_k \subset G_{k+1}$ (as a normal subgroup),
then $L_{G_k} \subset L_{G_{k+1}}$. On the other hand, $L_{G_k}$ is invariant under $G_{k+1}$. For let
$g \in G_{k+1}$ and $z_0 \in L_{G_k}$. There is a sequence $\{h_j\} \subset G_k$ so that for any $z \in D$,
$$\lim_{j \to \infty} h_j(z) = z_0.$$
Now $g h_j g^{-1} \in G_k$ and
$$\lim_{j \to \infty} h_j g^{-1}(z) = z_0,$$
so that
$$\lim_{j \to \infty} g h_j g^{-1}(z) = g(z_0).$$

Therefore $g(z_0) \in L_{G_k}$, and $L_{G_k}$ is invariant under $G_{k+1}$. Since $L_{G_k} = L_G$ and
$G$ is not quasi-abelian, $L_{G_k}$ contains more than 2 points. By Lemma 1,
$$L_{G_k} \supset L_{G_{k+1}}.$$
Thus
$$L_G = L_{G_k} = L_{G_{k+1}}.$$

It remains to consider the case where $G_{k+1}$ is a normal subgroup of $G_k$. In
this case $L_{G_{k+1}} \subset L_{G_k}$. Moreover, in the same manner as above we can show
that $L_{G_{k+1}}$ is invariant under $G_k$.

We assert that $L_{G_{k+1}}$ contains more than one point. If $L_{G_{k+1}} = \emptyset$, then
$G_{k+1}$ is a finite group. $G_{k+1}$ is either a group of rotations (and possibly
reflections) with a common fixed point $z \in D$ or a reflection group of order 2.
In the first case the point $z$ must be invariant under all transformations
in $G_k$. Then $G_k$ is also a finite group, so that
$$L_G = L_{G_k} = \emptyset.$$
But this implies that $G$ is quasi-abelian. In the second case, $G_{k+1}$ consists of
the identity and a reflection $r$ in some h-line $\lambda$. The elements of $G_k$ leave $\lambda$
invariant. $L_{G_k}$ is either empty or consists of the endpoints of $\lambda$. This is true
also of $L_G$, so that $G$ must be quasi-abelian. If $L_{G_{k+1}}$ contains only a single
point $z$, then this point is invariant under $G_k$. $G_k$ is a group of limit-rotations
with limit-centre $z$ (and possibly reflections in h-lines with one endpoint at
$z$). Then
$$L_G = L_{G_k} = \{z\}$$
and it follows that $G$ is quasi-abelian.
Lemma 1 now implies that $L_{G_{k+1}} \supset L_{G_k}$, so that
$$L_G = L_{G_k} = L_{G_{k+1}}.$$

It now follows that $L_H = L_{G_n} = L_G$.
The previous lemma and Theorem 3 imply the following. 


THEOREM 4. Let $H$ be a finitely generated N-subgroup of a non-quasi-abelian
group $G$. Then $[G : H]$ is finite.


LEMMA 3. If $U$ and $V$ are subnormal subgroups of a non-quasi-abelian group $F$,
then $U \cap V \neq \{1\}$.


Proof. Let $F_1$, $U_1$, and $V_1$ be the orientation-preserving subgroups of index
2 in $F$, $U$, and $V$ respectively. In the proof of Lemma 2, we saw that if an
N-subgroup of $F$ is a reflection group of order 2, then $F$ is quasi-abelian.
Therefore neither $U$ nor $V$ is a reflection group, so $U_1$ and $V_1$ are non-trivial,
subnormal subgroups of $F_1$. There exist normal series
$$F_1 \supset F_2 \supset \cdots \supset F_n = U_1,$$
$$F_1 \supset F_2' \supset \cdots \supset F_n' = V_1,$$
where some of the $F_k$ or some of the $F_k'$ might coincide. We shall prove
inductively that $F_k \cap F_k'$ is a non-trivial non-abelian group. If $F_2 \cap F_2' = \{1\}$,
then each element of $F_2$ commutes with each element of $F_2'$. But two
orientation-preserving transformations commute, if and only if they have the same
fixed points. This implies that $F_2$ (and $F_2'$) is a commutative group. The
elements of $F_2$ must be rotations with a common fixed point $z_1 \in D$,
limit-rotations with a common fixed point $z_2 \in E$, or translations with a common
axis $\lambda$. Since $F_2$ is normal in $F_1$, the elements of $F_1$ must have the same
invariant point or h-line. Therefore $F_1$ is abelian, and $F$ is quasi-abelian. From
this contradiction we conclude that $F_2 \cap F_2' \neq \{1\}$. Furthermore $F_2 \cap F_2'$
is not abelian, since this together with its normality in $F_1$ would imply that
$F_1$ is abelian. Now suppose that $F_k \cap F_k' \neq \{1\}$ and is non-abelian. $F_{k+1} \cap F_k'$
and $F_k \cap F_{k+1}'$ are normal subgroups of $F_k \cap F_k'$. By the same argument as
before, we conclude that $F_{k+1} \cap F_{k+1}' = (F_{k+1} \cap F_k') \cap (F_k \cap F_{k+1}') \neq \{1\}$
and is not abelian. It now follows that $U \cap V \neq \{1\}$.


THEOREM 5. Let H and K be two non-quasi-abelian subgroups of a discrete 
group. Then H and K are N-equivalent, if and only if there is a non-trivial 
subgroup J which is simultaneously subnormal in H and K. 


Proof. The “if” part is obvious; we shall prove the “only if” part. There
is an N-chain $H = G_1, G_2, \ldots, G_n = K$. The series
$$G_1 \supset G_1 \cap G_2 \supset G_1 \cap G_2 \cap G_3 \supset \cdots \supset \bigcap_{k=1}^{n} G_k$$
is a normal series. We shall prove inductively that
$$\bigcap_{k=1}^{m} G_k$$
is a non-trivial, subnormal subgroup of $G_m$. This is certainly true for $m = 1$;
assume that this is true for $m = p$. If $G_p$ is a normal subgroup of $G_{p+1}$, then
$$\bigcap_{k=1}^{p+1} G_k = \bigcap_{k=1}^{p} G_k \neq \{1\}.$$
Since
$$\bigcap_{k=1}^{p} G_k$$
is subnormal in $G_p$, which is normal in $G_{p+1}$, it follows that
$$\bigcap_{k=1}^{p+1} G_k$$
is subnormal in $G_{p+1}$. Now suppose that $G_{p+1}$ is a normal subgroup of $G_p$. $G_p$
cannot be quasi-abelian. The conditions of Lemma 3 are fulfilled, with
$$F = G_p, \qquad U = \bigcap_{k=1}^{p} G_k, \qquad V = G_{p+1}.$$
Therefore
$$\bigcap_{k=1}^{p+1} G_k \neq \{1\}.$$
Since
$$\bigcap_{k=1}^{p} G_k$$
is subnormal in $G_p$,
$$\Big(\bigcap_{k=1}^{p} G_k\Big) \cap G_{p+1}$$
is subnormal in $G_p \cap G_{p+1} = G_{p+1}$. It now follows that the group
$$J = \bigcap_{k=1}^{n} G_k$$
is a non-trivial, subnormal subgroup of $H$ and $K$.

This theorem implies that if $G$ is not quasi-abelian, then a subgroup $H$ is
an N-subgroup if and only if it contains a subnormal subgroup of $G$.


THEOREM 6. Let $H$ be a finitely generated non-quasi-abelian subgroup of $G$.
Then there is a subgroup $G_H$ of $G$ such that

(a) $G_H$ is N-equivalent to $H$,

(b) if $K \subset G$ and $K$ is N-equivalent to $H$, then $K \subset G_H$,

(c) $[G_H : H]$ is finite.


Proof. Let $G_H = \{g : g \in G,\ gL_H = L_H\}$. Since $H \subset G_H$, we have $L_H \subset L_{G_H}$.
Lemma 1 implies that $L_H \supset L_{G_H}$, so that $L_H = L_{G_H}$. Theorem 3 now implies
that $[G_H : H]$ is finite. From this it follows that $H$ has a finite number of
conjugate subgroups in $G_H$. The intersection of these conjugate subgroups is
a normal subgroup $F$ of finite index in $G_H$. Since $G_H$ is infinite, $F$ is non-trivial.
Therefore the sequence $G_H$, $F$, $H$ is an N-chain, and $G_H$ is N-equivalent to
$H$. If $K$ is N-equivalent to $H$, Lemma 2 implies that $K$ leaves $L_H$ invariant,
so that $K \subset G_H$.


THEOREM 7. Let H and K be finitely generated non-quasi-abelian subgroups 
of a discrete group. Then the following statements are equivalent: 

(a) H and K are N-equivalent; 

(b) there is a group J which is simultaneously normal and of finite index in 
H and K; 

(c) $L_H = L_K$.


Proof. If (a) is true, then $G_H = G_K$. (These are the groups introduced in
Theorem 6.) Since $H$ and $K$ are of finite index in $G_H$, this is also true of
$H \cap K$. Therefore $H \cap K$ contains a nontrivial subgroup $J$ which is normal
and of finite index in $G_H$. $J$ is also normal and of finite index in $H$ and $K$.
This shows that (a) implies (b).

If (b) is true, then $H$ and $K$ are N-equivalent. Therefore $L_H = L_K$. This
shows that (b) implies (c).

Now suppose (c) is true. Then $G_H = G_K$. It follows that $H$ and $K$ are both
N-equivalent to $G_H = G_K$, and hence to each other.

It would be interesting to determine whether there are algebraic conditions
equivalent to the condition $L_H = L_K$, when $H$ or $K$ is infinitely generated.

The following is proved in (9, p. 76). 


LEMMA 4. Let $U$ and $V$ be two groups such that the isometric circles of $U$
are contained in $R_V$ and the isometric circles of $V$ are contained in $R_U$. Then
the group generated by $U$ and $V$ is the free product $U * V$, and $R_{U*V} = R_U \cap R_V$.


THEOREM 8. Let H be a finitely generated subgroup of a finitely generated 
non-quasi-abelian group $G$. Then $[G : H]$ is finite if and only if $H$ is contained
in no infinitely generated subgroup of G. 


Proof. If $H$ is of finite index, then any larger group must also be of finite
index, and so it is finitely generated.

Now suppose $H$ is of infinite index. We shall find a subgroup of $G$ which
contains $H$ and is infinitely generated. We may assume that $G$ contains no
limit-rotations. Then $R_H$, which cannot be contained in $D$, contains intervals
on $E$. We first show that one of these intervals contains points of $L_G$ in its
interior.

If $L_H = L_G$, then by Theorem 3 $[G : H]$ is finite. Thus there is a point
$z_0 \in L_G - L_H$. Let $z \in R_H$; there is a sequence $\{g_n\} \subset G$ such that
$$\lim_{n \to \infty} g_n(z) = z_0.$$

Since $R_H$ is a fundamental region for $H$, there is $h_n \in H$ so that $h_n g_n(z) \in R_H$.
The sequence $\{h_n g_n(z)\}$ has a subsequence which converges to a point
$z_1 \in R_H \cap E$. The point $z_1$ is a limit point of $G$ and belongs to an interval
$I_1$ of $R_H \cap E$. $z_1$ might possibly be an endpoint of $I_1$. Since $G$ is not
quasi-abelian, $L_G$ consists of more than two points, and hence it is a perfect subset
of $E$ (see (4, p. 68)). Thus there is a sequence $\{x_n\}$ in $L_G$ which converges
to $z_1$. Suppose this sequence is outside $I_1$. $z_1$ is the endpoint of an isometric
circle of an element $h \in H$. The transformation $h$ maps $I_1$ outside $R_H \cap E$,
and maps a neighbouring interval, containing almost all of the sequence $\{x_n\}$,
onto an interval $I$ of $R_H \cap E$. Therefore $I$ contains points of $L_G$ in its interior.

As is shown in (10), the fixed points of the translations of $G$ are dense
in $L_G$ in the following sense. If $x, x' \in L_G$ and $J$ and $J'$ are intervals of $E$
which contain $x$ and $x'$ respectively, then there is a translation $g \in G$, with
a fixed point in each interval. A sufficiently high power $g^n$ has isometric
circles $\lambda$ and $\lambda'$ which intersect $E$ inside $J$ and $J'$ respectively. Since the
interval $I \subset R_H \cap E$ contains points in $L_G$, it contains an infinite sequence
of such points, which we denote by $\{y_1, y_1', y_2, y_2', \ldots\}$. Let $I_k$, $I_k'$ be mutually
disjoint subintervals of $I$ which contain $y_k$ and $y_k'$ respectively. There is a
translation $g_k \in G$ whose isometric circles $\lambda_k$ and $\lambda_k'$ intersect $E$ inside $I_k$ and
$I_k'$ respectively. By Lemma 4, the group generated by $\{g_1, g_2, \ldots\}$ is a free
group $F$ of infinite rank, whose fundamental region $R_F$ is the region in $D$
outside of all $\lambda_k$ and $\lambda_k'$. Lemma 4 now implies that the group $K$ generated
by $H$ and $F$ is the free product $H * F$. Thus $K$ is an infinitely generated group
containing $H$.


THEOREM 9. Let H be a finitely generated subgroup of a non-quasi-abelian 
group G. If H has a non-trivial intersection with every non-cyclic subgroup 
of G, then [G: H] is finite. 


Proof. We shall show that $L_H = L_G$. Since $G$ is not quasi-abelian, $L_G$ is a
perfect subset of $E$. Let $z \in L_G$, and let $I$ be an open interval of $E$ which
contains $z$. $I$ contains infinitely many points of $L_G$. Choose four of them
$z_1, z_1', z_2, z_2'$. Let $I_1, I_1', I_2, I_2'$ be non-intersecting subintervals of $I$, which
contain $z_1, z_1', z_2, z_2'$ respectively. As in the proof of Theorem 8, there are
translations $g_1$ and $g_2 \in G$, such that the isometric circles $\lambda_1$ and $\lambda_1'$ of $g_1$
intersect $E$ inside $I_1$ and $I_1'$ respectively, and the isometric circles $\lambda_2$ and $\lambda_2'$
of $g_2$ intersect $E$ inside $I_2$ and $I_2'$ respectively. The group $K$, generated by $g_1$ and
$g_2$, is a free group of rank 2. By hypothesis, the intersection $H \cap K$ is
non-trivial. It follows that $H$ has an element whose fixed points are in $I$. Since this
is true for any interval $I$ containing $z$, it follows that $z \in L_H$ and $L_H = L_G$.
Theorem 3 now implies the required result.


COROLLARY. Let $H$ and $K$ be finitely generated non-quasi-abelian subgroups
of a discrete group. If $H$ has a non-trivial intersection with every non-cyclic
subgroup of $K$, then $[K : H \cap K]$ is finite.


Proof. By Theorem 2, $H \cap K$ is finitely generated. The Corollary now
follows from Theorem 9.








REFERENCES

1. S. Bundgaard and J. Nielsen, On normal subgroups with finite index in F-groups, Mat. Tids. B (1951), 56-58.
2. H. S. M. Coxeter, On subgroups of the modular group, J. de Math. Pures et App. (1958), 317-319.
3. H. S. M. Coxeter and W. O. J. Moser, Generators and relations for discrete groups, Ergeb. der Math. (1957).
4. L. Ford, Automorphic functions (New York, 1951).
5. R. Fox, On Fenchel’s conjecture about F-groups, Mat. Tids. B (1952), 61-65.
6. K. Goldberg, Unimodular matrices of order 2 that commute, J. Washington Acad. Sci., 46 (1956), 337-338.
7. A. G. Howson, On the intersection of finitely generated free groups, J. London Math. Soc., 29 (1954), 428-434.
8. A. Karrass and D. Solitar, Note on a theorem of Schreier, Proc. Amer. Math. Soc., 8 (1957), 696-697.
9. W. Magnus, Discrete groups (New York University Notes, 1952).
10. J. Nielsen, Ueber Gruppen linearer Transformationen, Mitteilungen der Math. Ges. in Hamburg, Band VIII (1940), 82-104.
11. J. Nielsen, Kommutatorgruppen for det frie product af cykliske grupper, Mat. Tids. B (1948), 49-56.
12. J. Nielsen, Nogle grundlaeggende begreber vedrørende diskontinuerte grupper af lineaere substitutioner i en kompleks variabel, Den 11te Skandinaviske Matematikerkongress i Trondheim (1949), 61-70.


Brown University 


































LIMITS OF LATTICES IN A COMPACTLY 
GENERATED GROUP 


A. M. MACBEATH AND S. SWIERCZKOWSKI


1. Introduction. Let $G$ be a locally compact and $\sigma$-compact¹ topological
group and let $H$ be a discrete subgroup of $G$.² We shall use $G/H$ to denote the
space of right cosets $Hx$ of $H$ with the usual topology (cf. (8, pp. 26-28)). Let
$\mu$ be the left Haar measure in $G$. $\mu$ induces a measure in the space $G/H$;³ this
measure will, without ambiguity in this paper, also be denoted by $\mu$. If $\mu(G/H)$
is finite, the group $H$ is called a lattice. If the space $G/H$ is compact, then $H$
is certainly a lattice and is called a bounded lattice. These terms are an extension
of the usage of the Geometry of Numbers, where $G$ is the real $n$-dimensional
vector space $R^n$. In this case any lattice is generated by $n$ linearly independent
vectors, all lattices are bounded, and the whole family of lattices is permuted
transitively by the automorphisms of $G$ (which are the non-singular linear
transformations). The constant $\mu(G/H)$ is called the determinant of $H$ in this
case. The family of all lattices in Euclidean space forms a locally compact
topological space. In (7) Mahler proved the following


SELECTION THEOREM. Let $\{H_n\}$ be a sequence of lattices in $R^n$ with the following
properties

(i) There is a neighbourhood $V$ of the zero-vector $e$ such that, for all $n$,
$H_n \cap V = \{e\}$,

(ii) $\mu(G/H_n)$ is bounded above.
Then there exists a subsequence $\{H_{i_n}\}$ of $\{H_n\}$ which converges to a lattice $H$.


Let now $G$, $H$, and $\mu$ be as in the beginning. Mahler’s theorem suggests two
definitions. [Notation: $e$ is the unity of $G$, $N$ the class of open sets containing
$e$; $\bar K$, fr$K$ are the closure and boundary of $K$; $\cup$, $-$, $\cap$ denote the set union,
difference, intersection. We use $\varphi : G \to G/H$ for the natural mapping $\varphi(x) = Hx$.]


DEFINITION 1. A sequence $\{H_n\}$ of subgroups of $G$ is called uniformly discrete
if $H_n \cap V = \{e\}$ for a certain $V \in N$ and all $n$.


DEFINITION 2. A sequence $\{H_n\}$ of subgroups of $G$ converges to a subgroup $H$
if, given any compact set $C$ and any $V \in N$,
$$H \cap C \subset H_n V \quad\text{and}\quad H_n \cap C \subset HV$$
holds for all but a finite number of $n$.
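
To make Definition 2 concrete, the two inclusions can be tested in the simplest case $G = R$, with $H = sZ$ and $H_n = s_nZ$, $C = [-R, R]$ and $V = (-\varepsilon, \varepsilon)$. The following is a minimal sketch under these assumptions; the names `dist_to_lattice`, `points_in` and `close_on` are hypothetical and chosen only for this illustration.

```python
import math

def dist_to_lattice(x, step):
    """Distance from x to the nearest point of the subgroup step*Z of R."""
    return abs(x - step * round(x / step))

def points_in(step, R):
    """Points of step*Z lying in the compact set C = [-R, R]."""
    k = math.floor(R / step)
    return [j * step for j in range(-k, k + 1)]

def close_on(step_n, step, R, eps):
    """Check both inclusions of Definition 2 for H_n = step_n*Z, H = step*Z,
    C = [-R, R] and V = (-eps, eps)."""
    h_in_hn_v = all(dist_to_lattice(x, step_n) < eps for x in points_in(step, R))
    hn_in_h_v = all(dist_to_lattice(x, step) < eps for x in points_in(step_n, R))
    return h_in_hn_v and hn_in_h_v
```

For $H_n = (1 + 1/n)Z$ and $H = Z$ with $R = 10$, $\varepsilon = 0.1$, the check fails for $n = 50$ and succeeds for $n = 200$, in line with the requirement that the inclusions hold for all but finitely many $n$.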


Received April 27, 1959. 

¹That is, $G$ is a countable union of compact sets.

²As is well known, this implies that $H$ is countable.

³We give a precise definition of the induced measure in § 3.













428 A. M. MACBEATH AND S. SWIERCZKOWSKI 


Chabauty (2) has generalized Mahler’s theorem by showing that a uniformly
discrete sequence $\{H_n\}$ of subgroups of $G$ has a subsequence converging to
a discrete subgroup $H$ and moreover

(1) $\mu(G/H) \le \liminf \mu(G/H_n)$

so that $H$ is a lattice if all the $H_n$ are and $\mu(G/H_n)$ is bounded.
In the classical case $G = R^n$, it is of course easy to show that

(2) $\mu(G/H) = \lim \mu(G/H_n)$

and Chabauty has shown that in certain circumstances this is true also for
topological groups $G$. In this paper we make a further contribution to this
problem by proving that, if $H$ is a bounded lattice, then a necessary and
sufficient condition for (2) to hold is that $G$ should be compactly generated⁴
or that $H$ should be finitely generated. We shall give an example due to
M. Kneser showing that the boundedness of $H$ is essential. Thus it might
seem better to consider bounded lattices only, particularly since in Geometry
of Numbers all lattices are bounded. Unfortunately however, a lattice which
is a limit of bounded lattices need not be bounded. In § 6 we shall give an
example of such a lattice where $G$ is a homomorphic image of the group of
2 by 2 matrices with determinant unity.


2. Fundamental domain. As in (5) and (10) a Borel set $P$ will be called
a packing if $P \cap hP = \emptyset$ for $e \neq h \in H$ and a Borel set $C$ will be called a
covering if $HC = G$. $F$ is called a fundamental domain if it is both a packing
and a covering. In cases of ambiguity we may refer to an $H$-packing,
$H$-covering, or $H$-fundamental domain.

In this section we show, extending a result of Chabauty (1), and Siegel
(9), that there is a fundamental domain $F$ with $\mu(\mathrm{fr}F) = 0$, and also that if
$G/H$ is compact then there is such a fundamental domain with compact
closure $\bar F$.





We shall overlap in places with Chabauty’s results. We start with a lemma 
which shows that Chabauty’s axiom (MM) is always satisfied. 


LEMMA 2.1. If $C$ is compact and $U$ is open, $C \subset U$, then there is a Baire
measurable open set $V$ such that
$$C \subset V \subset U, \qquad \mu(\mathrm{fr}V) = 0.$$

In particular, taking $C = \{e\}$, there is a fundamental system of neighbourhoods
of the identity each of which has a frontier of measure 0.


Proof. Since the measure $\mu$ is regular, and the measure of any compact
set is finite (4, §§ 64 and 52), we may assume, on replacing $U$ by an open
subset, if necessary, that $\mu(U) < \infty$. Since the group $G$ is a completely
regular space (8, p. 29), a continuous function $f(x)$ exists such that $f(x) = 0$


⁴That is, have a compact set of generators.



















LATTICES IN A COMPACTLY GENERATED GROUP 429 


for $x \in C$, $f(x) = 1$ for $x \notin U$. Let $E(r) = \{x : f(x) < r\}$. The function $\mu(E(r))$
is a monotonic function of the real variable $r$, and therefore has at most
countably many discontinuities. Let $r_0$ be a value at which it is continuous.
Then
$$\overline{E(r_0)} = \bigcap \{\bar W : W \text{ open},\ W \supset E(r_0)\} \subset \bigcap_{r > r_0} E(r).$$
Hence
$$\mu(E(r_0)) \le \mu(\overline{E(r_0)}) \le \lim_{r \to r_0 + 0} \mu(E(r)) = \mu(E(r_0)).$$
This completes the proof, since $V = E(r_0)$ is a Baire set.
The following two lemmas are easily verified: 


LEMMA 2.2. If $A$, $B$ are packings and $C = (A - HB) \cup B$, then $C$ is a
packing and $HC = HA \cup HB$.


LEMMA 2.3.
$$B \cap \mathrm{fr}A \subset \mathrm{fr}(A \cap B) \cup \mathrm{fr}B.$$
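
Lemma 2.2 is purely set-theoretic and can be checked by brute force in a small finite model. The sketch below is an illustration only: it assumes $G = Z/12$ written additively with $H = \{0, 4, 8\}$, and the helper names `translate`, `saturate`, `is_packing` and `glue` are hypothetical.

```python
def translate(h, s, n):
    """The translate h + s in Z/n."""
    return {(h + x) % n for x in s}

def saturate(s, H, n):
    """The set H s, that is, the union of all H-translates of s."""
    return set().union(*(translate(h, s, n) for h in H))

def is_packing(s, H, n):
    """s is an H-packing if it is disjoint from h + s for every h != 0 in H."""
    return all(s.isdisjoint(translate(h, s, n)) for h in H if h % n != 0)

def glue(A, B, H, n):
    """The set C = (A - HB) ∪ B of Lemma 2.2."""
    return (A - saturate(B, H, n)) | B
```

With $A = \{0, 1, 6\}$ and $B = \{5, 7\}$, the set `glue(A, B, H, 12)` is $\{0, 5, 6, 7\}$, a packing whose saturation is the union of the saturations of $A$ and $B$.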


We begin now the construction of a fundamental domain. Our final result 
will be as follows. 


THEOREM 1. There is a fundamental domain $F$ such that
(i) $\mu(\mathrm{fr}F) = 0$;
(ii) If $G/H$ is compact, there exists a fundamental domain $F$ satisfying (i)
such that also $\bar F$ is compact.
The proof of this Theorem is closely modelled on that of Siegel (9).


Proof. Since $G$ is locally compact and $H$ is discrete, we can, by Lemma 2.1,
choose $V \in N$ so that $\mu(\mathrm{fr}V) = 0$, $\bar V$ is compact, and $V$ is an $H$-packing.
Since $G$ is $\sigma$-compact, $G = \bigcup_n Vx_n$ for some sequence $\{x_n\} \subset G$. Define $F_1 = Vx_1$,
$F_n = Vx_n - H(Vx_1 \cup \ldots \cup Vx_{n-1})$. Let $F = \bigcup F_n$. Then clearly $F$ is an
$H$-packing, since each $F_n$ is and since $F_m \cap hF_n = \emptyset$ for $m \neq n$ and $h \in H$.
Also $HF = G$, for if $g \in G$,
there is a least integer $n$ such that $g \in HVx_n$ and then $g \in HF_n \subset HF$. Thus
$F$ is a fundamental domain.

To show that $\mu(\mathrm{fr}F) = 0$, set $C_n = Vx_1 \cup \ldots \cup Vx_{n-1}$. Then
$\mathrm{fr}C_n \subset \mathrm{fr}Vx_1 \cup \ldots \cup \mathrm{fr}Vx_{n-1}$, so $\mu(\mathrm{fr}C_n) = 0$. Also
$$F_n = Vx_n - HC_n = Vx_n - \bigcup_{h \in H} (hC_n \cap Vx_n).$$
If empty terms are dropped from the last union, only those $h$ remain for
which $h \in Vx_nC_n^{-1}$. Since $Vx_nC_n^{-1}$ is a bounded set, the number of $h$ is finite,
say $h_1, \ldots, h_m$, and we have
$$F_n = Vx_n - (h_1C_n \cup \ldots \cup h_mC_n),$$
$$\mu(\mathrm{fr}F_n) \le \mu(\mathrm{fr}Vx_n) + \sum_{i=1}^{m} \mu(\mathrm{fr}\,h_iC_n) = 0.$$

By Lemma 2.3, $Vx_n \cap \mathrm{fr}F \subset \mathrm{fr}(Vx_n \cap F) \cup \mathrm{fr}Vx_n = \mathrm{fr}F_n \cup \mathrm{fr}Vx_n$. Thus
$$\mu(\mathrm{fr}F) \le \sum_n \mu(Vx_n \cap \mathrm{fr}F) = 0.$$

In the case when $G/H$ is compact, $G/H$ can be covered by a finite union
$\varphi(Vx_1) \cup \ldots \cup \varphi(Vx_n)$, so $F = F_1 \cup \ldots \cup F_n$ will be a fundamental
domain. Since $F$ is then contained in the bounded set $C_{n+1}$, it is itself bounded.
This completes the proof of Theorem 1.
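
The inductive construction $F_n = Vx_n - H(Vx_1 \cup \ldots \cup Vx_{n-1})$ can be traced in a finite model, where the measure-theoretic statements are vacuous but the combinatorics is the same. The sketch below assumes $G = Z/12$ written additively, $H = \{0, 4, 8\}$ and $V = \{0, 1\}$; these choices and the function names are hypothetical and serve only as an illustration.

```python
def translate(h, s, n):
    """The translate h + s in Z/n."""
    return {(h + x) % n for x in s}

def saturate(s, H, n):
    """The set H s, that is, the union of all H-translates of s."""
    return set().union(*(translate(h, s, n) for h in H))

def is_packing(s, H, n):
    """s is an H-packing if it is disjoint from h + s for every h != 0 in H."""
    return all(s.isdisjoint(translate(h, s, n)) for h in H if h % n != 0)

def fundamental_domain(n, H, V, xs):
    """Accumulate F_k = V x_k - H(V x_1 ∪ ... ∪ V x_{k-1}) over the sequence xs."""
    covered = set()              # C_k = V x_1 ∪ ... ∪ V x_{k-1}
    F = set()
    for x in xs:
        Vx = translate(x, V, n)
        F |= Vx - saturate(covered, H, n)
        covered |= Vx
    return F
```

Running `fundamental_domain(12, {0, 4, 8}, {0, 1}, range(12))` yields $\{0, 1, 2, 3\}$, which is both a packing and a covering for $H$.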


We conclude this section with a slightly more precise form of the statement 
of Theorem 1. This is required for a later application. 


LEMMA 2.4. If S is any covering, then there is a fundamental domain contained 
in S. 


Proof. Let $F$ be any fundamental domain. We have $F \subset HS$. Thus $F$ is a
union of $h$-translates of subsets of $S$ and therefore $F$ is also a disjoint union
of $h$-translates of subsets of $S$, say $F = h_1S_1 \cup h_2S_2 \cup \ldots$. It is obvious that
$F_0 = S_1 \cup S_2 \cup \ldots$ is a fundamental domain contained in $S$.


3. The induced measure in G/H. Since we regard the group $H$ as a
group of permutations acting on $G$ by left translation, it follows that each
$H$-orbit is a right coset $Hx$. This is why we use $G/H$ for the space of right
cosets, instead of the more usual homogeneous space of left cosets. On the
space $G/H$ the group $G$ acts transitively by right translation. If $\Delta(x)$ is the
real-valued function defined on $G$ by the relation $\mu(Ex) = \Delta(x) \cdot \mu(E)$, then
it follows from the criterion in (11, p. 45) that there is a measure $\bar\mu$ on Borel
subsets $E$ of $G/H$ such that $\bar\mu(Ex) = \bar\mu(E) \cdot \Delta(x)$. For our purposes it is more
convenient to define $\bar\mu$ directly from the natural mapping $\varphi : G \to G/H$
($\varphi(x) = Hx$), as follows: If $F$ is any fundamental domain, define
$$\bar\mu(E) = \mu(\varphi^{-1}(E) \cap F).$$

It follows from (5, Theorem 1, Corollary), applied to the measure space
$\varphi^{-1}(E)$ and the group $H$ of transformations of this space, that this expression
does not depend on the particular fundamental domain chosen. We shall,
for $S \subset G$, use $S/H$ to denote $\varphi(S)$ and we shall write $\mu$ for $\bar\mu$.

We conclude this section with three lemmas which will be useful later. 

Before stating the first lemma, we note that if $G_1$ is any open subgroup
of $G$, the same measure $\mu$, but with its domain of definition restricted to $G_1$,
will serve as a Haar measure on $G_1$.


LEMMA 3.1. If $G_1$ is an open subgroup of $G$ and $H_1 = G_1 \cap H$, then
$\mu(G_1/H_1) = \mu(G_1/H)$.


Proof. Let $F_1$ be a fundamental domain for $H_1$ in $G_1$. Then $hF_1 \cap F_1 \neq \emptyset$,
$h \in H$, implies $h \in F_1F_1^{-1} \subset G_1$, so $h \in H_1$, and $h = e$. Thus $F_1$ is an
$H$-packing. If $F$ is a fundamental domain for $H$ in $G$, then so is
$F^* = F_1 \cup (F - HF_1)$, by Lemma 2.2. By our definition of induced measure,
$$\mu(G_1/H) = \mu(F^* \cap G_1) = \mu(F_1) = \mu(G_1/H_1).$$







LEMMA 3.2. If $H_1 \subset H$ and $H : H_1$ denotes the index of $H_1$ in $H$, then we have

$$\mu(G/H_1) = (H : H_1)\,\mu(G/H).$$

Proof. Let $F$ be an $H$-fundamental domain and let $X$ be a complete system of representatives of left cosets of $H_1$ in $H$. One checks that $XF$ is an $H_1$-fundamental domain and our result follows since the cardinality of $X$ is $H : H_1$.
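The index formula of Lemma 3.2 can be checked by hand in a finite toy case. The following is a minimal sketch, assuming (purely for illustration) the additive group $G = \mathbb{Z}/12\mathbb{Z}$ with counting measure, $H = 2G$ and $H_1 = 6G$; the helper names are hypothetical and not taken from the paper.

```python
N = 12  # assumed toy group G = Z/12Z, written additively, with counting measure

def subgroup(gen):
    """Cyclic subgroup of Z/NZ generated by gen."""
    return sorted({(gen * k) % N for k in range(N)})

def coset_representatives(sub, space):
    """Pick one representative from each coset sub + x, x in space."""
    seen, reps = set(), []
    for x in space:
        if x not in seen:
            reps.append(x)
            seen.update((h + x) % N for h in sub)
    return reps

def is_fundamental_domain(F, sub):
    """True if the translates sub + F cover G exactly once."""
    return sorted((h + f) % N for h in sub for f in F) == list(range(N))

H = subgroup(2)                            # order 6
H1 = subgroup(6)                           # order 2, contained in H
F = coset_representatives(H, range(N))     # an H-fundamental domain
X = coset_representatives(H1, H)           # coset representatives of H1 in H
XF = sorted({(x + f) % N for x in X for f in F})
index = len(H) // len(H1)                  # the index H : H1
```

Here the measure of $G/H$ is `len(F)` and the measure of $G/H_1$ is `len(XF)`, so the lemma reduces to `len(XF) == index * len(F)`.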


LEMMA 3.3. If $H \subset G_1$, where $G_1$ is an open subgroup of $G$, then

$$\mu(G/H) = (G : G_1)\,\mu(G_1/H).$$

Proof. If $G_1$ is not unimodular, neither is $G$ and $\mu(G/H) = \mu(G_1/H) = \infty$ (9, Lemma 5). Suppose next that $G_1$ is unimodular, but not $G$. Then, since $G_1$ is open, $\mu$ is also the Haar measure for $G_1$ and we have $\Delta(x) = 1$ for $x \in G_1$. However, if $\Delta(x) \neq 1$, where $x \in G$, then $\Delta(x^n) \neq 1$ for each natural $n$. All the elements $x^n$ must then belong to different left cosets of $G_1$ and hence $G : G_1 = \infty$. Again both sides are infinite.

The remaining case to consider is when $G$ is unimodular. Then, if $F$ is an $H$-fundamental domain for $G_1$ and $X$ is a complete system of representatives of right cosets of $G_1$, we verify that $FX$ is an $H$-fundamental domain for $G$. Since $G$ is unimodular our result follows since the cardinality of $X$ is $G : G_1$.

LEMMA 3.4. If $G_1$ is an open subgroup of $G$, then

$$\mu(G/H) \le (G : G_1)\,\mu(G_1/H).$$

Proof. Let $H_1 = G_1 \cap H$. By Lemmas 3.1, 3.2, and 3.3 we have

$$\mu(G/H_1) = (G : G_1)\,\mu(G_1/H_1) = (G : G_1)\,\mu(G_1/H),$$
$$\mu(G/H_1) = (H : H_1)\,\mu(G/H) \ge \mu(G/H).$$

This proves the lemma.


LEMMA 3.5. If $K$ is an open subgroup of $G$ and $HK$ is also a subgroup, then

$$\mu(HK/H) = \mu(K/K \cap H).$$

Proof. Let $F$ be a $(K \cap H)$-fundamental domain for the group $K$. One checks that $F$ is an $H$-fundamental domain for $HK$.


4. Limits of discrete subgroups. In this section we assume $G/H$ compact. We consider the following two closely related questions:

I. In what groups $G$ does the relation $\lim H_n = H$ imply $\lim \mu(G/H_n) = \mu(G/H)$ for any uniformly discrete sequence of subgroups $\{H_n\}$?

II. Under what circumstances does $\lim H_n = H$ imply $\lim \mu(G/H_n) = \mu(G/H)$ if $\{H_n\}$ is restricted to be a uniformly discrete sequence of lattices?

Our answer to I is complete, given by the theorem below. As to question II we give a little extra information in Theorem 3. Another kind of answer was found by Chabauty and we present in § 5 an alternative proof of his result (our Theorem 4).













THEOREM 2. The following four statements are equivalent:
(i) $G$ is compactly generated.
(ii) $H$ is finitely generated.
(iii) If $\{H_n\}$ is a sequence of discrete subgroups, $\lim H_n = H$, then $\limsup \mu(G/H_n) \le \mu(G/H)$.
(iv) If $\{H_n\}$ is a uniformly discrete sequence of subgroups, $\lim H_n = H$, then $\lim \mu(G/H_n) = \mu(G/H)$.

Proof. We have proved in a recent paper that (i) implies (ii) (see 6). Suppose (ii) holds. It follows from Theorem 1 (ii) that there exists an $H$-fundamental domain $F$ with compact closure $\bar F$. If $\Gamma$ is the finite set of generators of $H$, then the compact set $\Gamma \cup \bar F$ is obviously a set of generators of $G$. Hence (ii) implies (i) and so (i) and (ii) are equivalent. By Chabauty's inequality (1), (iii) implies (iv). Thus it remains to prove that (iv) implies (ii) and that (i) implies (iii).


Proof that (iv) implies (ii). Suppose that (ii) is false. Then, $H$ being countable, let its elements be enumerated $h_1, h_2, \dots$, and let $H_n$ be the subgroup generated by the elements $h_1, \dots, h_n$. If $C$ is a compact set, $C \cap H$ is finite and if $n_0$ is the largest value of $r$ for which $h_r$ lies in $C$, we have $C \cap H = C \cap H_n$ for $n > n_0$. Thus $\lim H_n = H$. However, the index $H : H_n$ is infinite, otherwise $H$ would have a finite system of generators given by $h_1, \dots, h_n$ together with a complete system of representatives of the $H_n$-cosets. It follows from Lemma 3.2 that $\mu(G/H_n) = \infty$ for all $n$. But $\mu(G/H) < \infty$, so (iv) is false.


Proof that (i) implies (iii). Let $F$ be an $H$-fundamental domain with compact closure $\bar F$, such that $\mu(\mathrm{fr}\,F) = 0$. We have $\mu(G/H) = \mu(F) = \mu(\bar F)$. Let $\epsilon > 0$. We have to show that, for sufficiently large $n$, $\mu(G/H_n) < \mu(G/H) + \epsilon$. Choose $V \in N$, $V$ compact, so that

(3) $\mu(V\bar F) < \mu(\bar F) + \epsilon$.

Let $D$ be a compact system of generators of $G$. Replacing $D$ by $D \cup D^{-1}$, if necessary, we may assume that

(4) $\bigcup_{i=1}^{\infty} D^i = G$.

The set $V\bar F D\bar F^{-1}$ is compact, so there is a finite number of elements $h_1, \dots, h_r$ of $H$ in it. We have $V\bar F D \subset H\bar F$; but $h\bar F \cap V\bar F D = \emptyset$ unless $h \in V\bar F D\bar F^{-1}$, that is, unless $h$ is one of the elements $h_1, \dots, h_r$. It follows that

(5) $V\bar F D \subset h_1\bar F \cup \dots \cup h_r\bar F$.

Since $\lim H_n = H$, there is a number $n_0$ such that, for $n > n_0$, $H_nV$ contains each of the elements $h_1, \dots, h_r$, and hence from (5), $V\bar F D \subset H_nV\bar F$. But $H_n$ is a subgroup, $H_n = H_n^k$ for each positive integer $k$, and thus

$$V\bar F D^k \subset H_nV\bar F D^{k-1} \subset \dots \subset H_n^{k-1}V\bar F D \subset H_n^kV\bar F = H_nV\bar F.$$













Thus $G = H_nV\bar F$ by (4). Hence $V\bar F$ is an $H_n$-covering and by the theorem on packings and coverings in (5) it follows from (3) that

$$\mu(G/H_n) \le \mu(V\bar F) < \mu(G/H) + \epsilon.$$

To state our next theorem briefly, it is convenient to have another definition. A pair $(G, H)$ consisting of a locally compact $\sigma$-compact group $G$ and a discrete subgroup $H$ with $G/H$ compact will be called a tractable pair if the following condition holds. Given any uniformly discrete sequence $\{H_n\}$ of lattices in $G$ such that $\lim H_n = H$, then $\lim \mu(G/H_n) = \mu(G/H)$.


THEOREM 3. If $G$ contains an open compactly generated subgroup $K$ such that for $h \in H$

(6) $hKh^{-1} = K$

then $(G, H)$ is tractable if and only if $(H, H)$ is tractable.


Proof. It is quite clear that if $(H, H)$ is not tractable, then $(G, H)$ is not tractable. For there will be a sequence $\{Q_n\}$ of subgroups of $H$ of finite index such that $\lim Q_n = H$, but $H : Q_n > 1$ for infinitely many $n$. By Lemma 3.2, $\mu(G/Q_n) = (H : Q_n)\,\mu(G/H) \ge 2\mu(G/H)$ for infinitely many $n$. Thus $(G, H)$ is not tractable.

We now assume, therefore, that $(H, H)$ is tractable and our aim is to prove that $(G, H)$ is tractable. We shall show that if $\{H_n\}$ is a sequence of lattices in $G$ and $\lim H_n = H$, then

(7) $\limsup \mu(G/H_n) \le \mu(G/H)$.

Hence for a uniformly discrete sequence $\{H_n\}$ of lattices we have by (1), $\lim \mu(G/H_n) = \mu(G/H)$, that is, $(G, H)$ is tractable.

Since the topology in $H$ is discrete, our assumption that $(H, H)$ is tractable means that, if $\{Q_n\}$ is a sequence of subgroups of $H$ with the following properties:

(8) (i) $H = \bigcup_{m=1}^{\infty} \bigcap_{n=m}^{\infty} Q_n$,
(ii) $H : Q_n < \infty$,

then there is a number $n_0$ such that $H = Q_n$ for $n > n_0$.

Suppose now that $\{H_n\}$ is a sequence of lattices in $G$ such that $\lim H_n = H$. To show (7) we shall associate with the sequence $\{H_n\}$ a sequence $\{Q_n\}$ of subgroups of $H$ which satisfies the conditions (8). We observe first that, by (6), $HK$ and

$$M_n = H_n \cap HK, \quad P_n = M_nK, \quad Q_n = H_nK \cap H$$

are subgroups of $G$ and moreover $P_n$ is open.


LEMMA 4.1. $H : Q_n = HK : P_n$.













Proof. One checks easily that any complete system of representatives of 
left cosets of Q, in H is also a complete system of representatives of left 
cosets of P, in HK. 


LEMMA 4.2. For $n > n_0$, $Q_n = H$, $P_n = HK$.

Proof. By Lemma 4.1 it is enough to show that $Q_n = H$. Since $(H, H)$ is tractable this follows if we show that conditions (8) are satisfied. To prove that $Q_n$ has finite index, we note that, by Lemmas 3.1 and 3.3,

$$(HK : P_n)\,\mu(P_n/M_n) = \mu(HK/M_n) = \mu(HK/H_n) \le \mu(G/H_n) < \infty.$$

Now $P_n$ is for sufficiently large $n$ a non-empty open set, so $\mu(P_n/M_n) > 0$, and by Lemma 4.1, $H : Q_n = HK : P_n < \infty$.

To show that (8) (i) holds we have to show that if $h \in H$, then, for sufficiently large $n$, $h \in Q_n$. To see this we note that $K \in N$, so for sufficiently large $n$, $hK \cap H_n \neq \emptyset$, that is, $h \in H_nK$. This proves our lemma.

We are now in a position to prove (7). By Theorem 2, since $K$ is compactly generated,

(9) $\limsup \mu(K/K \cap H_n) \le \mu(K/K \cap H)$.

From Lemma 3.5, we have $\mu(HK/H) = \mu(K/K \cap H)$. If, in Lemma 3.5, we replace $H$ by $M_n$ so that $HK$ is replaced by $P_n$, we find that $\mu(P_n/M_n) = \mu(K/K \cap H_n)$. From Lemma 3.1, we have $\mu(P_n/H_n) = \mu(P_n/M_n)$ since $P_n \cap H_n = M_n$. Hence $\mu(P_n/H_n) = \mu(K/K \cap H_n)$ and substituting in (9) we derive

(10) $\limsup \mu(P_n/H_n) \le \mu(HK/H)$.

For sufficiently large $n$ we have, by Lemma 4.2,

(11) $\mu(HK/H_n) = \mu(P_n/H_n)$.

Using (10), (11), and Lemmas 3.3 and 3.4,

$$\mu(G/H) = (G : HK)\,\mu(HK/H) \ge (G : HK) \limsup \mu(P_n/H_n) = (G : HK) \limsup \mu(HK/H_n) \ge \limsup \mu(G/H_n).$$

This completes our proof.

5. A result of Chabauty. We shall give now an alternative proof of a theorem of Chabauty (1) which combined with (1) yields another kind of answer to our question II.


THEOREM 4. If $\{H_n\}$ is a sequence of lattices, $\lim H_n = H$ and there exists a set $S$ of finite measure which is an $H_n$-covering for each $n$, then

$$\limsup \mu(G/H_n) \le \mu(G/H).$$

Proof. Let $F$, $F_n$ denote the $H$- and $H_n$-fundamental domains so that

$$\mu(G/H_n) = \mu(F_n), \quad \mu(G/H) = \mu(F).$$













By Lemma 2.4 we may assume $F_n \subset S$. From $S \subset HF$ follows that we can cover $S$, except for a set of arbitrarily small measure, by a finite union $h_1F \cup \dots \cup h_kF$, $h_i \in H$. Since $H = \lim H_n$ it follows that these sets in turn can be approximated by unions

$$h_1^{(n)}F \cup \dots \cup h_k^{(n)}F, \quad \text{where } h_i^{(n)} \in H_n.$$

Therefore, for sufficiently large $n$, an arbitrarily small part of $S$ remains uncovered by $H_nF$. Hence, by $F_n \subset S$, we have $\lim\,[\mu(F_n) - \mu(F_n \cap H_nF)] = 0$. Since

$$\mu(F_n \cap H_nF) = \mu\Big(\bigcup_{h \in H_n} (F_n \cap hF)\Big) \le \sum_{h \in H_n} \mu(F_n \cap hF) = \sum_{h \in H_n} \mu(h^{-1}F_n \cap F) = \mu(H_nF_n \cap F) = \mu(F)$$

the theorem follows.


6. Examples. In this section we give three examples illustrating different 
possible properties of convergent sequences of discrete subgroups. 


Example 1. It follows from Theorem 2 that, if $G$ is compactly generated, $G/H_n$ compact and $\lim H_n = H$, then $\limsup \mu(G/H_n) \le \mu(G/H)$. To show that this need not be true if $G$ is not compactly generated, take $G = H = G_1 \times G_2 \times \dots \times G_n \times \dots$, the weak direct product of a countable family of cyclic groups of order 2, with the discrete topology. Define $H_n$ to be the set of all $g = (g_1, g_2, \dots, g_n, \dots) \in G$ with $g_n = e$. Then $\mu(G/H_n) = 2$, $\lim H_n = H$, $\mu(G/H) = 1$.
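The two claims of Example 1 (each $H_n$ has index 2, yet every finite, hence compact, set eventually lies in $H_n$) can be illustrated with a small sketch. It assumes a hypothetical encoding in which an element of the weak direct product is the finite set of coordinates where it differs from $e$; the function names are illustrative only.

```python
def in_H_n(g, n):
    """g is the finite set of coordinates at which the element differs from e."""
    return n not in g

def multiply(g, h):
    """Product in the weak direct product of groups of order 2."""
    return g ^ h  # symmetric difference of the supports

def stabilisation_index(C):
    """Smallest n0 such that the finite set C lies inside H_n for every n > n0."""
    return max((max(g) for g in C if g), default=0)

# A sample compact (that is, finite) subset C of the discrete group G.
C = [frozenset(), frozenset({1, 3}), frozenset({2, 5, 7})]
n0 = stabilisation_index(C)

def generator(n):
    """The element whose only non-trivial coordinate is the n-th one."""
    return frozenset({n})
```

For every $g$ and $n$, exactly one of $g$ and `multiply(generator(n), g)` lies in $H_n$, which is the index-2 statement; and $C \cap H_n = C = C \cap H$ for $n > n_0$, which is the convergence $\lim H_n = H$ tested on $C$.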


Example 2. In this example $G$ is a connected Lie group, and $G/H_n$ is compact for each $n$, but $G/H$ is not compact. Let $G$ be the group of all linear transformations

$$w = \frac{az + b}{cz + d},$$

where $w$, $z$ are complex variables, $a, b, c, d$ are real and $ad - bc > 0$. In addition to $G$ we consider the set $G_1$ of inversions, that is, transformations of the form

$$w = \frac{a\bar z + b}{c\bar z + d},$$

where $\bar z$ is the complex conjugate, and $a, b, c, d$ are real with $ad - bc < 0$.

The set $G \cup G_1$ is a group of transformations of the upper half-plane $\Im z > 0$ on itself, and $G$ is a normal subgroup of index 2. The topology is the natural one obtained from the variables $a, b, c, d$.

Let $P$ be the point $i = +\sqrt{-1}$, and let $Q = ki$ ($1 < k < \sqrt{3}$) be a variable point on the imaginary axis. Let $C(Q)$ be the circle through $Q$ with centre on the positive real axis and cutting the imaginary axis at an angle $\tfrac{1}{3}\pi$. Let $C(Q)$ cut $|z| = 1$ in $R$ and consider the curved triangle $PQR$, made up of













part of the imaginary axis and parts of the circles. As $k$ varies between 1 and $\sqrt{3}$, the angle at $R$ will decrease continuously from $\tfrac{1}{6}\pi$ to 0. Thus there will be a sequence of points $Q_7, Q_8, Q_9, \dots$, and corresponding points $R_7, R_8, R_9, \dots$, such that the angles at $R$ take the values $\tfrac{1}{7}\pi, \tfrac{1}{8}\pi, \tfrac{1}{9}\pi, \dots$.
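The positions of the points $Q_n$ can be computed explicitly. The sketch below rests on two assumptions that are not spelled out in the text: that $C(Q)$ meets the imaginary axis at the angle $\pi/3$, which forces centre $k/\sqrt{3}$ and radius $2k/\sqrt{3}$ for $Q = ki$, and that the interior angle $\theta$ at $R$ equals the angle between the radii of the two circles there, so that the law of cosines gives $\cos\theta = \sqrt{3}(1 + k^2)/(4k)$. The function names are illustrative.

```python
import math

def k_for_angle(theta):
    """Height k of Q = ki for which the angle at R equals theta (0 <= theta <= pi/6)."""
    c = math.cos(theta)
    # Root in [1, sqrt(3)] of sqrt(3) k^2 - 4 c k + sqrt(3) = 0.
    return (2 * c + math.sqrt(max(4 * c * c - 3, 0.0))) / math.sqrt(3)

def angle_at_R(k):
    """Angle between |z| = 1 and C(Q), from cos(angle) = sqrt(3)(1 + k^2)/(4k)."""
    return math.acos(min(1.0, math.sqrt(3) * (1 + k * k) / (4 * k)))

# Heights of Q_7, Q_8, Q_9: the angles at R are pi/7, pi/8, pi/9.
ks = {n: k_for_angle(math.pi / n) for n in (7, 8, 9)}
```

Under these assumptions the endpoint values agree with the text: $k = 1$ gives the angle $\pi/6$ and $k = \sqrt{3}$ gives the angle 0, where $C(Q)$ has centre 1 and radius 2 and touches the unit circle at $-1$.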

It is easy to see that the subgroup $K_n$ of $G \cup G_1$ generated by the operations of inversion in the circles $PR_n$, $Q_nR_n$ and reflection in the line $PQ_n$ is a discrete subgroup of $G \cup G_1$. Let $K_n \cap G = H_n$. Regarded as a group of transformations of the complex plane, it has as a fundamental domain the interior of the curved triangle $PQ_nR_n$, the reflection of this triangle in the line $PQ_n$, together with some of the boundary points of this region. It is one of the triangle groups well known in the theory of automorphic functions (2; 3).

An $H_n$-fundamental domain in $G$ is the set of all mappings $t$ of $G$ such that $tP$ lies in the fundamental domain in the $z$-plane just described. For each $n$, the closure of the triangle $PQ_nR_n$ lies in the interior of the upper half-plane, so $G/H_n$ is compact.

The limit $H$ of the sequence $H_n$ has a fundamental domain which is obtained in the same way from the triangle $PQ_\infty R_\infty$, where $Q_\infty = i\sqrt{3}$, $R_\infty = -1$, and the $R$-angle of the curved triangle is zero. However, $G/H$ is not compact because the closure of its fundamental domain contains the point $R_\infty$, which is a boundary point of the upper half-plane, and is not equal to $tP$ for any $t \in G$.


Example 3. This example indicates that the conclusions of Theorem 2 cease 
to be true if G/H is not compact, even when G is connected and H finitely 
generated. The example was suggested to us in conversation by Professor 
Martin Kneser, and we are grateful to him for permission to include it here. 

Let $P, Q, R, S$ be four points on the real axis in the order indicated. Consider the operations $t_1, t_2, t_3, t_4$ of inversion in the circles on diameters $SP$, $PQ$, $QR$, $RS$. These generate a discrete subgroup $H$ of $G \cup G_1$ which is a free product of four cyclic groups of order 2. Its fundamental domain in the upper half-plane is the interior of the curved quadrilateral $PQRS$. Keep $P, Q, S$ fixed and let $R$ pass through a sequence of points tending to $S$. The group $H$ will tend to a limit $H_\infty$, which is generated by inversions in the circles $SP$, $PQ$, $QS$. The fundamental domain in the half-plane is the triangle $PQS$.

Now in the hyperbolic plane, the area of triangles with zero angles is a constant. Since the quadrilateral $PQRS$ is a union of two such triangles, its area is twice the area of the triangle $PQS$. Returning to the original group-space, we deduce without difficulty that

$$\mu(G/H \cap G) = 2\mu(G/H_\infty \cap G).$$
















REFERENCES 


1. C. Chabauty, Limite d'ensembles et géométrie des nombres, Bull. Soc. Math. France, 78 (1950), 143-151.
2. L. R. Ford, Automorphic functions (New York: Chelsea, 1951).
3. R. Fricke and F. Klein, Vorlesungen ueber die Theorie der Automorphen Funktionen (Leipzig: Teubner, 1897-1912).
4. P. R. Halmos, Measure theory (New York: Van Nostrand, 1950).
5. A. M. Macbeath, Abstract theory of packings and coverings, I (to appear in Proc. Glasgow Math. Assoc.).
6. A. M. Macbeath and S. Swierczkowski, On the set of generators of a subgroup, Indag. Math., 21 (1959), 280-281.
7. K. Mahler, On lattice points in n-dimensional star bodies. I, Existence Theorems, Proc. Roy. Soc. London, Ser. A, 187 (1946), 151-187.
8. D. Montgomery and L. Zippin, Topological transformation groups (New York: Interscience tracts, 1955).
9. C. L. Siegel, Discontinuous groups, Ann. Math., 44 (1943), 674-678.
10. S. Swierczkowski, Abstract theory of packings and coverings, II (to appear in Proc. Glasgow Math. Assoc.).
11. A. Weil, L'intégration dans les groupes topologiques et ses applications (Paris: Hermann, 1951).




Queen's College 
Dundee, Scotland 











CANONICAL FORMS FOR CERTAIN MATRICES 
UNDER UNITARY CONGRUENCE 


J. W. STANDER AND N. A. WIEGMANN


1. Introduction. If $A$ is a matrix with complex elements and if $A = A^T$ (where $A^T$ denotes the transpose of $A$), there exists a non-singular matrix $P$ such that $PAP^T = D$ is a diagonal matrix (see (3), for example). It is also true (see the principal result of (5)) that for such an $A$ there exists a unitary matrix $U$ such that $UAU^T = D$ is a real diagonal matrix with non-negative elements which is a canonical form for $A$ relative to the given $U$, $U^T$ transformation. If $A = -A^T$, it is known (see (3) or (4)) that there exists a non-singular matrix $P$ such that $PAP^T$ is a direct sum of a zero matrix (if present) and of $2 \times 2$ blocks of the form:

$$\begin{bmatrix} 0 & 1 \\ -1 & 0 \end{bmatrix}.$$


The present work is concerned with the following. First, a canonical form 
is obtained for a complex skew-symmetric matrix under a $U$, $U^T$ transformation
where U is a unitary complex matrix; this form is analogous to that of the 
symmetric matrix mentioned above. Thereafter, matrices with real quaternion 
elements are considered. For such an A the *-transpose (denoted by A*) is 
defined and is seen to be a generalization of the transpose (of a complex 
matrix) for the non-commutative case which at the same time retains the 
properties of the ordinary transpose in the commutative case. Quaternion 
matrices of the form A = A* and A = —A* are considered, in turn, and 
results analogous to those mentioned above for complex matrices are ob- 
tained which justify this generalization. 


2. A normal form for a complex symmetric matrix under unitary 
congruence. To obtain this form the following is employed: 


LEMMA 1. If $A$ is a complex, unitary, skew-symmetric matrix there exists a complex unitary matrix $U$ such that $UAU^T = E$ is a direct sum of $2 \times 2$ matrices

$$\begin{bmatrix} 0 & 1 \\ -1 & 0 \end{bmatrix}.$$

It is evident that $A$ must be of even order since it is skew-symmetric and non-singular. Let $A = A_1 + iA_2$, where $A_1$ and $A_2$ are real matrices, so that $A_1 = -A_1^T$ and $A_2 = -A_2^T$. Since $AA^{CT} = (A_1 + iA_2)(A_1^T - iA_2^T) = I$,


Received April 23, 1959. 


it follows that $A_1A_1^T + A_2A_2^T = I$ and $A_2A_1^T = A_1A_2^T$. The latter becomes $A_2A_1 = A_1A_2$. By a known theorem (see (2), for example), there exists a real orthogonal matrix $T$ such that $TA_1T^T = E_1$ and $TA_2T^T = E_2$ are direct sums of zeros and $2 \times 2$ matrices of the form

(i) $\begin{bmatrix} 0 & a \\ -a & 0 \end{bmatrix}$

where $a > 0$ is real. Furthermore, it can be shown that, as in the present case, when $A_1$ and $A_2$ are both skew-symmetric, $E_1$ and $E_2$ can be regarded as conformable direct sums of $2 \times 2$ matrices of the above form, of $2 \times 2$ zero matrices, and of $1 \times 1$ zero matrices in such a way that whenever a single zero element appears in the direct sum of one, it appears in the same diagonal position in the other. (A $2 \times 2$ matrix of form (i) in one can correspond to a $2 \times 2$ zero matrix in the other, of course.) This may be seen as follows:

The statement is true or there is a first block (in $E_1$ or $E_2$) in the direct sum where it is not true; this would mean that there would be corresponding $3 \times 3$ diagonal blocks in $E_1$ and $E_2$, respectively, of the form

$$\begin{bmatrix} 0 & 0 & 0 \\ 0 & 0 & b \\ 0 & -b & 0 \end{bmatrix}, \qquad \begin{bmatrix} 0 & a & 0 \\ -a & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix}$$

where $a \neq 0$ and $b \neq 0$. But since $A_2A_1 = A_1A_2$, the above matrices must commute and they do not. Hence $E_1$ and $E_2$ can be considered to be direct sums which are conformable as described above.
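The claim that the two $3 \times 3$ blocks fail to commute is a short computation. A minimal sketch follows, with the sample values $a = 2$, $b = 3$ chosen arbitrarily for illustration:

```python
def matmul(A, B):
    """Product of two square matrices given as nested lists."""
    n = len(A)
    return [[sum(A[r][t] * B[t][c] for t in range(n)) for c in range(n)]
            for r in range(n)]

def blocks(a, b):
    """The two 3x3 diagonal blocks with a single zero in different positions."""
    E1 = [[0, 0, 0], [0, 0, b], [0, -b, 0]]
    E2 = [[0, a, 0], [-a, 0, 0], [0, 0, 0]]
    return E1, E2

E1, E2 = blocks(a=2, b=3)
left = matmul(E1, E2)    # only non-zero entry: a*b in position (3, 1)
right = matmul(E2, E1)   # only non-zero entry: a*b in position (1, 3)
```

The two products each have the single non-zero entry $ab$, but in transposed positions, so they differ whenever $a \neq 0$ and $b \neq 0$.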

Therefore $T(A_1 + iA_2)T^T = E_1 + iE_2$ which is unitary (and non-singular); consequently, no $1 \times 1$ zero element can appear alone along the diagonal of $E_1$ and $E_2$ in the form described for each in the preceding paragraph. Therefore, $E_1$ and $E_2$ are each direct sums of $2 \times 2$ matrices of form (i) where $a \ge 0$, so that $E_1 + iE_2$ is a direct sum of $2 \times 2$ blocks of the form

(ii) $E_0 = \begin{bmatrix} 0 & \alpha \\ -\alpha & 0 \end{bmatrix}$

where $\alpha$ is non-zero complex. Since $E_1 + iE_2$ is unitary, $\alpha\bar\alpha = 1$. Let $\alpha = e^{i\theta}$ and form the $2 \times 2$ unitary matrix


0 cr 
v=| oon 0 |. 
Then $VE_0V^T$ is a matrix of the form

$$\begin{bmatrix} 0 & 1 \\ -1 & 0 \end{bmatrix}.$$
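The normalization step can be checked numerically. The sketch below uses one choice of $V$ that works, the scalar matrix $V = e^{-i\theta/2} I$; this choice is an assumption made for illustration and is not necessarily the matrix $V$ used by the authors. The value of $\theta$ is arbitrary.

```python
import cmath

def congruence_2x2(V, E):
    """Return V E V^T for 2x2 matrices given as nested lists."""
    VE = [[sum(V[r][t] * E[t][c] for t in range(2)) for c in range(2)]
          for r in range(2)]
    # Multiplying by V^T on the right: (V^T)[t][c] = V[c][t].
    return [[sum(VE[r][t] * V[c][t] for t in range(2)) for c in range(2)]
            for r in range(2)]

theta = 0.7                           # arbitrary sample angle
alpha = cmath.exp(1j * theta)         # alpha = e^{i theta}, so |alpha| = 1
E0 = [[0, alpha], [-alpha, 0]]
u = cmath.exp(-1j * theta / 2)        # assumed choice: V = u * I with u^2 * alpha = 1
V = [[u, 0], [0, u]]
result = congruence_2x2(V, E0)        # numerically [[0, 1], [-1, 0]]
```

Any unitary diagonal $V = \mathrm{diag}(u, v)$ with $uv\alpha = 1$ would serve equally well, since $VE_0V^T$ then has off-diagonal entries $\pm uv\alpha$.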


If $S$ is an appropriate direct sum of such $V$ (determined from each $2 \times 2$ matrix in the direct sum $E_1 + iE_2$), then $ST(A_1 + iA_2)T^TS^T = E$, the direct













sum as described in the statement of the lemma, where U = ST is a complex 
unitary matrix. 


THEOREM 1. If $A$ is a complex skew-symmetric matrix, there exists a complex unitary matrix $V$ such that $VAV^T = E \dotplus 0$ where $E$ is a direct sum of $2 \times 2$ matrices of the form

$$\begin{bmatrix} 0 & a \\ -a & 0 \end{bmatrix},$$

where $a > 0$ is real; and conversely.

Let $A = HU = UK \neq 0$ be a polar representation of $A$ where $H$ and $K$ are hermitian and $U$ is unitary. (It may be noted that each $a > 0$ described in the statement of the theorem is actually a characteristic root of $H$ or $K$.) Since $A = HU = UK = -A^T = -U^TH^T = -K^TU^T$, and since the hermitian polar matrix $H$ is unique, it follows from $A = HU = -K^TU^T$ that $H = K^T$ or $H = -K^T$ (since $-K^TU^T$ is also a polar form of $A$). But since $K$ is positive definite, $K^T$ is also, and $H = -K^T$ cannot hold (since $H$ would not be positive definite). Therefore $H = K^T$.

If A, skew-symmetric, is non-singular, it must be of even order; in any 
event, the rank of A is even. If A = HU, the rank of A = the rank of H =r, 
an even number. 

For $H = K^T$ let $V_1$ be a complex unitary matrix such that $V_1HV_1^{CT} = D = D_0 \dotplus 0$ (where 0 is absent if $H$ is non-singular) where $D_0 = D_1 \dotplus D_2 \dotplus \dots \dotplus D_k$, where $D_i = d_iI_i$ is a real diagonal scalar matrix, $d_i \neq d_j$ for $i \neq j$, and $d_1 > d_2 > \dots > d_k > 0$. If $A$ is non-singular, it is known (see (9)) that the polar representation is unique, so that $A = HU = K^T(-U^T)$ implies that $U = -U^T$. If $A$ is singular, this need not be true (8); as a matter of fact, it cannot be true if $A$ is of odd order since $U$ is non-singular.

Consider the case where $A = HU$ is singular. Let $V_1UV_1^{CT} = W$ and $V_1(-U^T)V_1^{CT} = W_1$; also let $V_1KV_1^{CT} = V_1H^TV_1^{CT} = M$. Then from

$$V_1AV_1^{CT} = V_1HUV_1^{CT} = V_1UKV_1^{CT} = V_1(-U^TH^T)V_1^{CT} = V_1(-K^TU^T)V_1^{CT}$$

it follows that $V_1AV_1^{CT} = DW = WM = W_1M = DW_1$. From $WM = W_1M$ it follows, in turn, that

$$W(V_1H^TV_1^{CT}) = W_1(V_1H^TV_1^{CT}),$$

or

$$WV_1V_1^TDV_1^CV_1^{CT} = W_1V_1V_1^TDV_1^CV_1^{CT},$$

so that $WV_1V_1^TD = W_1V_1V_1^TD$. Since $DW = DW_1$ (and since $D$ has rank $r$), $W$ and $W_1$ have like first $r$ rows, and so $WV_1V_1^T$ and $W_1V_1V_1^T$ also have like first $r$ rows; and from the last result in the preceding, $WV_1V_1^T$ and $W_1V_1V_1^T$ also have like first $r$ columns. Let $WV_1V_1^T$ be of the form

$$\begin{bmatrix} A_{11} & A_{12} \\ A_{21} & X \end{bmatrix}$$










where $A_{11}$ is an $r \times r$ matrix. Since $DW = WM = W_1V_1V_1^TDV_1^CV_1^{CT}$, therefore $DWV_1V_1^T = W_1V_1V_1^TD$. From this relation it follows, after equating corresponding elements and noting that $W_1V_1V_1^T$ is of the same form as $WV_1V_1^T$ except for $X$, that $A_{12}$ and $A_{21}$ are zero matrices. Then:

$$WV_1V_1^T = A_{11} \dotplus X, \qquad W_1V_1V_1^T = A_{11} \dotplus Y,$$
$$W = (A_{11} \dotplus X)V_1^CV_1^{CT} = V_1UV_1^{CT}, \qquad W_1 = (A_{11} \dotplus Y)V_1^CV_1^{CT} = V_1(-U^T)V_1^{CT},$$
$$U = V_1^{CT}(A_{11} \dotplus X)V_1^C, \qquad -U^T = V_1^{CT}(A_{11} \dotplus Y)V_1^C.$$

Therefore, $U^T = V_1^{CT}(A_{11}^T \dotplus X^T)V_1^C = V_1^{CT}(-A_{11} \dotplus [-Y])V_1^C$ and so $A_{11} = -A_{11}^T$ and $A_{11}$ must also be unitary (since $U^T$ is) and $Y = -X^T$ where $X$ is unitary but otherwise arbitrary. So $V_1UV_1^T = A_{11} \dotplus X$ and $V_1(-U^T)V_1^T = A_{11} \dotplus Y$.

Then $V_1AV_1^T = V_1HV_1^{CT}V_1UV_1^T = V_1(-U^T)V_1^TV_1^CH^TV_1^T = (D_0 \dotplus 0)(A_{11} \dotplus X) = (A_{11} \dotplus Y)(D_0 \dotplus 0)$. This means that $V_1AV_1^T = D_0A_{11} \dotplus 0$ where $D_0A_{11} = A_{11}D_0$ is of (even) order $r$, and $A_{11}$ is unitary and skew-symmetric. It follows that $A_{11} = A_1 \dotplus A_2 \dotplus \dots \dotplus A_k$, where $A_i$ is of the order of $D_i$ in $D_0 = D_1 \dotplus D_2 \dotplus \dots \dotplus D_k$, and that each $A_i$ is unitary and skew-symmetric and hence of even order. From the lemma for each $A_i$ there exists a complex unitary $U_i$ such that $U_iA_iU_i^T$ is a direct sum of the $2 \times 2$ matrices described in the lemma. If $U = U_1 \dotplus \dots \dotplus U_k$, then $UV_1AV_1^TU^T = D_0E_0 \dotplus 0$ where $E_0$ is a direct sum of $2 \times 2$ matrices of the form described in the lemma. Then $D_0E_0$ is the matrix $E$ described in the theorem, and since $UV_1$ is unitary, the theorem has been obtained. If $A$ is non-singular, the same proof holds and $D = D_0$, $U = V_1^{CT}A_{11}V_1^C$, etc., and 0 does not appear in the final form $E \dotplus 0$.


The converse is immediate. 


3. A normal form for a *-symmetric quaternion matrix under 
unitary congruence. If two matrices A and B have elements which lie 
in a non-commutative domain, among the properties of the transpose which 
do not hold (as they do in the commutative case) is that $(AB)^T = B^TA^T$. If a matrix $A$ with real quaternion elements is written in the form $A = A_1 + jA_2$ (where $A_1$ and $A_2$ are complex matrices), then $A^T = A_1^T + jA_2^T$. Also, by the conjugate transpose of $A$ is meant the matrix $A^{CT} = A_1^{CT} + (jA_2)^{CT} = A_1^{CT} - jA_2^T$ (where $A_1^{CT}$ denotes the complex conjugate transpose of $A_1$).

If the *-transpose of the matrix $A$ is defined to be the matrix $A^* = A_1^T + A_2^Tj$, it is seen that this includes the ordinary transpose of a complex matrix as a special case. Among the properties of the *-transpose which can easily be verified are the following: $(A^*)^* = A$; $A^* = ij\,A^{CT}\,ji$; $(A + B)^* = A^* + B^*$; $(AB)^* = B^*A^*$; if $A$ is non-singular, $(A^*)^{-1} = (A^{-1})^*$; $(A^*)^{CT} = (A^{CT})^*$. Define $A$ to be *-symmetric if $A = A^*$, and to be *-skew-symmetric if $A = -A^*$. In the following, canonical forms are found for such matrices













under unitary congruence which are clearly generalizations of the theorems 
for the complex case stated in the two preceding sections. 
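The product rule $(AB)^* = B^*A^*$, and the failure of the corresponding rule for the ordinary transpose, can be checked on a small example. The sketch below assumes quaternions stored as coefficient tuples for $(1, i, j, k)$ with $k = ij$. In that encoding the scalar map $z_1 + jz_2 \mapsto z_1 + z_2j$ simply changes the sign of the $k$-coefficient, so the *-transpose is the transpose with that map applied entrywise. The sample matrices and helper names are illustrative.

```python
from functools import reduce

def qmul(p, q):
    """Hamilton product of quaternions stored as (1, i, j, k) coefficient tuples."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (a1*a2 - b1*b2 - c1*c2 - d1*d2,
            a1*b2 + b1*a2 + c1*d2 - d1*c2,
            a1*c2 - b1*d2 + c1*a2 + d1*b2,
            a1*d2 + b1*c2 - c1*b2 + d1*a2)

def qadd(p, q):
    return tuple(x + y for x, y in zip(p, q))

def star(q):
    """Scalar *-map: z1 + j*z2 -> z1 + z2*j, which flips the sign of the k part."""
    a, b, c, d = q
    return (a, b, c, -d)

def matmul(A, B):
    n = len(A)
    return [[reduce(qadd, (qmul(A[r][t], B[t][c]) for t in range(n)))
             for c in range(n)] for r in range(n)]

def transpose(A):
    return [list(col) for col in zip(*A)]

def star_transpose(A):
    """*-transpose: transpose the matrix and apply the scalar *-map to each entry."""
    return [[star(x) for x in row] for row in transpose(A)]

ONE, I, J, K = (1, 0, 0, 0), (0, 1, 0, 0), (0, 0, 1, 0), (0, 0, 0, 1)
A = [[I, J], [ONE, K]]
B = [[J, ONE], [K, I]]

star_rule_holds = star_transpose(matmul(A, B)) == matmul(star_transpose(B), star_transpose(A))
plain_rule_holds = transpose(matmul(A, B)) == matmul(transpose(B), transpose(A))
```

For these matrices the (1,1) entry of $AB$ is $ij + jk = k + i$ while that of $B^TA^T$ is $ji + kj = -k - i$, which is why the ordinary transpose rule fails.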
The following lemma is first obtained: 


LEMMA 2. If $U$ is a unitary quaternion matrix (that is, $UU^{CT} = I = U^{CT}U$) which is also *-symmetric ($U = U^*$), there exists a complex unitary matrix $Z$ such that $ZUZ^* = D_0 + jD$ where $D_0$ and $D$ are real diagonal matrices for which $D_0^2 + D^2 = I$.

Let $U = U_1 + jU_2$, where $U_1$ and $U_2$ are complex matrices. Since $U = U_1 + jU_2 = U^* = U_1^T + U_2^Tj$, it follows that $U_1 = U_1^T$ and $U_2 = U_2^{CT}$. Since, also, $UU^{CT} = (U_1 + jU_2)(U_1^{CT} - jU_2^T) = I$, $U_1U_1^{CT} + U_2^CU_2^T = I$ and $U_2U_1^{CT} = U_1^CU_2^T$ or, taking conjugates, $U_2^CU_1^T = U_1U_2^{CT}$ or $U_2^CU_1 = U_1U_2$. Let $V$ be a complex unitary matrix such that $VU_2V^{CT} = D = D_1 \dotplus D_2 \dotplus \dots \dotplus D_k$, where $D_i = d_iI_i$ for $d_i$ real, $d_i \neq d_j$ for $i \neq j$, and where $d_1 > d_2 > \dots > d_k$; also let $V^CU_1V^{CT} = N$. Since $U_2^CU_1 = U_1U_2$, $V^CU_2^CV^TV^CU_1V^{CT} = V^CU_1V^{CT}VU_2V^{CT}$ or $DN = ND$. Therefore $N = N_1 \dotplus N_2 \dotplus \dots \dotplus N_k$ is a direct sum conformable to $D$. Since $N = N^T$, $N_i = N_i^T$ for all $i$; consequently, there is a complex unitary $W_i$ for each $N_i$ such that $W_iN_iW_i^T = D_{0i}$ is a real diagonal matrix. If $W = W_1 \dotplus W_2 \dotplus \dots \dotplus W_k$, then $WNW^T = D_{01} \dotplus D_{02} \dotplus \dots \dotplus D_{0k} = D_0$ is a direct sum of real diagonal matrices. Then $WV^C(U_1 + jU_2)V^{CT}W^T = W(N + jD)W^T = D_0 + jD$ where $D_0$ and $D$ are real diagonal matrices and $WV^C$ is a complex unitary matrix. Furthermore, since $U$, $V$, and $W$ are each unitary, $D_0 + jD$ is also and $(D_0 + jD)(D_0 - jD) = D_0^2 + D^2 = I$; the lemma is then true (and the converse is also, incidentally).


THEOREM 2. If $A$ is a *-symmetric quaternion matrix, there exists a quaternion unitary matrix $U$ such that $UAU^* = D$ is a real diagonal matrix with non-negative diagonal elements; and conversely.

This is clearly an analogue of the theorem for the complex case mentioned in §1, above; and its proof proceeds as does the proof for the complex case given in (7, p. 36). If $A = HV = VK$ is the polar form of the quaternion matrix $A$ (see (6)), the proof follows the same pattern except that *-transpose replaces $T$-transpose and the elements involved are quaternion (though the matrix $D$ is still a real diagonal matrix). It is then found that for $A = HV = VH^*$, there exists (7, p. 37) a quaternion unitary matrix $U$ such that $UAU^* = UHU^{CT}UVU^* = UVU^*(U^*)^{CT}H^*U^* = DW = WD$ where $D$ is a real diagonal matrix as there described and $W = UVU^* = W^*$ is now a quaternion unitary matrix. Since $D$ is real diagonal with like roots arranged together along the diagonal, $W = W_1 \dotplus W_2 \dotplus \dots \dotplus W_k$ is a direct sum conformable to that of $D$ (as a direct sum of scalar matrices) and each $W_i = W_i^*$ is unitary; it may be noted that if $D = D_0 \dotplus 0$ (as in (7)) and if 0 is present, $W_k$ will be chosen to have these properties also. By the preceding lemma, a complex unitary $Z_i$ can be chosen so that $Z_iW_iZ_i^* = Z_iW_iZ_i^T = D_{0i} + jD_{1i}$ where










$D_{0i}$ and $D_{1i}$ are real diagonal with the properties given. If $Z = Z_1 \dotplus Z_2 \dotplus \dots \dotplus Z_k$, then $ZUAU^*Z^* = ZDWZ^* = DZWZ^* = D(D_3 + jD_4) = D_1 + jD_2$ where $D_1$ and $D_2$ are real diagonal and $ZU$ is a quaternion unitary matrix.

To obtain the form given in the theorem, an additional step is required. If $\alpha = a + ib$, $a$ and $b$ real, is any complex number, since it is a $1 \times 1$ matrix and is equal to its transpose, there exists a complex unitary (number) $u = u_1 + iu_2$ so that $u\alpha u^T = u\alpha u = r$, a real number. If $j$ replaces $i$ in this relation, the result still holds (since only $j$ and real numbers are involved); therefore, if $\alpha = a + jb$ is any diagonal element of $D_1 + jD_2$, there exists a quaternion unitary $u = u_1 + ju_2$ so that $u\alpha u^* = r$ is real. If this is applied to each diagonal element, the form described in the theorem can be obtained under the transformation required.
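The $1 \times 1$ step is just a rotation by half the argument. As a minimal sketch in the complex case, assuming the natural choice $u = e^{-i\theta/2}$ with $\theta = \arg\alpha$ (the function name is illustrative):

```python
import cmath

def rotate_to_real(alpha):
    """Return a unit complex number u and the value u * alpha * u."""
    theta = cmath.phase(alpha)           # argument of alpha
    u = cmath.exp(-1j * theta / 2)       # |u| = 1, so u is a 1x1 unitary matrix
    return u, u * alpha * u

alpha = 3 - 4j                           # sample value with |alpha| = 5
u, r = rotate_to_real(alpha)             # r is (numerically) the real number 5
```

With this choice the real number obtained is $|\alpha|$, hence non-negative, which matches the non-negative diagonal in the statement of the theorem.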

The converse follows immediately and the form is a canonical form, the 
diagonal elements being the characteristic roots of the hermitian polar matrix 
of A. 


4. A normal form for a *-skew-symmetric matrix under unitary 
congruence. For this case there is the following lemma: 


LEMMA 3. If $A$ is a *-skew-symmetric, unitary quaternion matrix, there exists a unitary complex matrix $V$ such that $VAV^*$ is a direct sum of $1 \times 1$ matrices of the form $+ji$ and $-ji$, and of $2 \times 2$ matrices of the form

$$\begin{bmatrix} jri & a \\ -a & -jri \end{bmatrix}$$

where $a^2 + r^2 = 1$ and $a > 0$ and $r$ are real numbers.

Since $A = A_1 + jA_2 = -A^* = -(A_1^T + A_2^Tj)$, it follows that $A_1 = -A_1^T$ and $A_2 = -A_2^{CT}$. Since $AA^{CT} = I = A^{CT}A$, it follows, among other relations, that $A_2A_1^{CT} = A_1^CA_2^T$ and $A_1^TA_2 = A_2^TA_1$. Since $A_2$ is skew-hermitian, let $U$ be a complex unitary matrix such that $UA_2U^{CT} = D = D_1 \dotplus D_2 \dotplus D_3 \dotplus \dots \dotplus D_k$ is a direct sum of $D_s = ir_sI_s$ (where $r_s$ is real), that is, of pure imaginary scalar matrices, arranged as follows: $r_s \neq r_t$ if $s \neq t$; if $ir_s$ and $-ir_s$ are roots of $A_2$, their corresponding blocks appear successively on the diagonal; all such successive pairs of blocks, if present, appear first in $D$; and $D_k = 0$ if 0 is a root of $A_2$. Let $U^CA_1U^{CT} = M$.

From $A_2A_1^{CT} = A_1^CA_2^T$ it follows that

$$UA_2U^{CT}UA_1^{CT}U^T = UA_1^CU^TU^CA_2^TU^T,$$

or $DM^{CT} = M^CD^T$; taking conjugates, $D^CM^T = MD^{CT}$ or $-D(-M) = M(-D)$ or $DM = -MD$ (since $M^T = -M$). Therefore, $D^2M = DDM = -D(MD) = MD^2$. Let $D = (D_1 \dotplus D_2) \dotplus \dots \dotplus (D_l \dotplus D_{l+1}) \dotplus D_{l+2} \dotplus \dots \dotplus D_k$ where the parentheses contain the successive pairs described earlier. Then $M = M_{12} \dotplus \dots \dotplus M_{l,l+1} \dotplus M_{l+2} \dotplus \dots \dotplus M_k$ where $M_{rs}$ is of the













dimension of $D_r \dotplus D_s$, $M_t$ is of the dimension of $D_t$, and all $M_{rs}$ and $M_t$ are complex skew-symmetric (since $M$ is). Furthermore, since $-DM = MD$, it follows that $-(D_r \dotplus D_s)M_{rs} = M_{rs}(D_r \dotplus D_s)$ and $-D_tM_t = M_tD_t$ for all $M_{rs}$ and $M_t$ involved. Finally, it may be noted that $U^CAU^{CT} = U^C(A_1 + jA_2)U^{CT} = U^CA_1U^{CT} + jUA_2U^{CT} = M + jD$ must be *-skew-symmetric and unitary. (Note that $U$ is complex and $U^CA(U^C)^* = U^CAU^{CT}$ is *-skew-symmetric since $A$ is also.)

(a) Consider, first, any relation $-(D_r \dotplus D_s)M_{rs} = M_{rs}(D_r \dotplus D_s)$ and, for convenience, the case where $r = 1$ and $s = 2$. Let $D_1 \dotplus D_2 = riI_1 \dotplus (-ri)I_2$ where $I_1$ and $I_2$ are, respectively, $p \times p$ and $q \times q$ identity matrices, $r \neq 0$, and assume, for specificity, that $p \le q$. Let $M_{12}$ be of the form

$$\begin{bmatrix} M_1 & M_3 \\ -M_3^T & M_2 \end{bmatrix}$$

where $M_1$ and $M_2$ are, respectively, $p \times p$ and $q \times q$ matrices. From the relation $-(D_1 \dotplus D_2)M_{12} = M_{12}(D_1 \dotplus D_2)$, it follows that $M_1$ and $M_2$ are zero matrices (since $r \neq 0$). Now $M_3$ may be a zero matrix or it may not; before proceeding further, consider the latter case.

If $M_3$, a $p \times q$ matrix, is not zero, by a theorem of Eckart and Young (1) it follows that there exist complex unitary matrices $V$ and $W$, of orders $p \times p$ and $q \times q$, respectively, such that $VM_3W = D$ is a $p \times q$ diagonal matrix with non-negative real elements (at least one of which is not 0 here) along the diagonal. (A $p \times q$ matrix is diagonal if the only non-zero elements are of the form $a_{ii}$.) Form the matrix

$$X = \begin{bmatrix} 0 & W^T \\ V & 0 \end{bmatrix}$$

which is complex unitary. Then $X(M_{12} + jD_{12})X^* = XM_{12}X^T + jX^CD_{12}X^T$ is a matrix of the form

$$\begin{bmatrix} 0 & -D^T \\ D & 0 \end{bmatrix} + j\begin{bmatrix} D_2 & 0 \\ 0 & D_1 \end{bmatrix}$$

where $D$ is the above-mentioned $p \times q$ diagonal matrix. Let $N_1 = XM_{12}X^T$ and $N_2 = X^CD_{12}X^T$, and note that the dimension of $D_2 = q \ge p =$ dimension of $D_1$, that $D_1$ and $D_2$ have non-0 diagonal elements, and $D$ has at least one non-zero diagonal element; also, let the non-0 diagonal elements of $D$ appear first along the diagonal. Consider $N_1$ and $N_2$ and perform the following operations on them: interchange the $(q+1)$st column of $N_1$ successively with the $q$th, $(q-1)$st, $(q-2)$nd, ..., 2nd so that the $(q+1)$st column becomes the second column and all succeeding columns are in the same order as before; and also perform the same row operations. This can be accomplished by a real orthogonal similarity transformation and there result from $N_1$ and $N_2$, respectively, the matrices











$$\begin{bmatrix} 0 & a_1 & 0 & 0 \\ -a_1 & 0 & 0 & 0 \\ 0 & 0 & 0 & -D_1^{*} \\ 0 & 0 & D_1 & 0 \end{bmatrix}, \qquad \begin{bmatrix} -ri & 0 & 0 & 0 \\ 0 & ri & 0 & 0 \\ 0 & 0 & -riI_3 & 0 \\ 0 & 0 & 0 & riI_4 \end{bmatrix}$$

where $I_3$ and $I_4$ are, respectively, identity matrices of order $q - 1$ and $p - 1$.
If the same procedure is applied to the lower right blocks (ignoring
the first two rows and columns of each), it can be seen that a series of such
steps provides a real orthogonal matrix $Y$ such that the matrix
$YX(M_{12} + jD_{12})X^{*}Y^{*}$ is a direct sum of $2 \times 2$ blocks of the form

$$\begin{bmatrix} -jri & a_1 \\ -a_1 & jri \end{bmatrix}$$

(where $a_1$ and $r$ are non-zero and real), and of single elements $-jri$ and $+jri$.
But since $YX$ is complex unitary, so is this direct sum, and so each $2 \times 2$
block and $jri$ must be unitary. This means that $r^2 + a_1^2 = 1$ and $r^2 = 1$;
but since $a_1 \neq 0$, this can only mean that $jri$ and $-jri$ cannot appear singly
in the direct sum. Therefore $YX(M_{12} + jD_{12})X^{*}Y^{*}$ is a direct sum of $2 \times 2$
blocks of the above form where $r^2 + a_1^2 = 1$, $r \neq 0$ and $a_1 \neq 0$. (If in the
above $p \geq q$, the roles of $+jri$ and $-jri$ are interchanged, but a simple (and
allowable) operation at the close can still place the element $-jri$ in the 1-1
position.)

All of the above in (a) occurs if $M_3$ is not a zero matrix. If $M_3 = 0$, then
$M_{12} + jD_{12} = jD_{12} = j(D_1 + D_2)$ which is a direct sum with diagonal
elements $\pm jri$, $r^2 = 1$; in this case no $X$ and $Y$ are required.

Therefore in $U^{C}AU^{CT} = M + jD$, each $M_{rs} + j(D_r + D_s)$ can be treated
as above depending on whether or not $M_{rs}$ is a zero matrix.

(b) Consider any relation $-D_rM_r = M_rD_r$ where $D_r$ is a non-zero pure
imaginary scalar matrix. Then $M_r = -M_r$, so $M_r$ is a zero matrix and
$M_r + jD_r = jD_r$ which has diagonal elements $jri$, $r^2 = 1$.

(c) If $D_r = 0$ is present in $UA_2U^{CT} = D$, then $M_r + jD_r = M_r = -M_r^{*}$,
a complex unitary matrix. By Lemma 1 there exists a complex unitary matrix
$U$ such that $UAU^{*} = E$ is a direct sum as described in the lemma.

If the results of (a), (b), and (c) are combined, it is evident that a complex
unitary matrix $W$ can be constructed so that $WU^{C}AU^{CT}W^{T} = W(M + jD)W^{*}$
is a direct sum of $2 \times 2$ matrices of the form

$$\begin{bmatrix} jri & a \\ -a & -jri \end{bmatrix}$$

(where $a^2 + r^2 = 1$, $a > 0$ and $r$ are real) and of $1 \times 1$ matrices of the form
$ji$ and $-ji$.


THEOREM 3. If $A$ is a *-skew-symmetric quaternion matrix, there exists a
quaternion unitary matrix $V$ such that $VAV^{*} = E + 0$ where $E$ is a direct
sum of $1 \times 1$ matrices of the form $kji$ and $-kji$, $k > 0$ real, and of $2 \times 2$ matrices
of the form

$$\begin{bmatrix} sji & t \\ -t & -sji \end{bmatrix}$$

where $t > 0$ and $s$ are real.

The proof follows the pattern of that of Theorem 1. If $A = 0$, the result is
trivial. If $A \neq 0$, let $A = HU = UK$ be a polar representation of $A$. If
*-transpose replaces $T$-transpose in the earlier proof, it is evident that $H = K^{*}$.
Here, however, the rank of a *-skew-symmetric matrix is not necessarily even
(as the preceding lemma shows). If the earlier proof is followed, it is seen


eventually that, using the same letters, U = V,°T(Ay, + X)V,*°T and U* = 
- Vi9T (Ay + Y)V;*% so that U* = V,°°(Aq* + X*)V;S? = — VST (Ay 
+ Y)V,** and, since V,°™* = V,*°T, Ay* = —Ay, is quaternion uni- 


tary. Then V,AV;* = V,HV,°TV,UV;* = (Di + 0)(An + X) = (DiAn + 
0) = Vi(—U*) Vit Vi*OTH*Vi* = (An + Y)(Di + 0) = (AnD: + 0). Since 
$D_1A_{11} = A_{11}D_1$, $A_{11}$ is a direct sum, $A_1 + A_2 + \cdots + A_k$ (of *-skew-symmetric,
unitary quaternion matrices) conformable to the direct sum of $D_1$.
For each $A_i$ there exists, by the preceding lemma, a complex unitary matrix
$W_i$ so that $W_iA_iW_i^{*}$ has the form described in the lemma. If $W = W_1 + W_2
+ \cdots + W_k + I$ (where $I$ is of the order of 0 in $D_1 + 0$), $WV_1AV_1^{*}W^{T}$ is
then a direct sum of $1 \times 1$ matrices of the form $kji$ and $-kji$ ($k > 0$ is real),
of $2 \times 2$ matrices of the form

$$\begin{bmatrix} jrci & ac \\ -ac & -jrci \end{bmatrix}$$

where $ac > 0$ is real, and of a zero matrix. ($WV_1$ is a unitary quaternion
matrix.)


REFERENCES

1. C. Eckart and G. Young, A principal axis transformation for non-hermitian matrices, Bull. Amer. Math. Soc., 45 (1939), 118-121.
2. N. Jacobson, Lectures in abstract algebra (New York: D. Van Nostrand, 1953), 184.
3. C. C. MacDuffee, The theory of matrices (Chelsea, 1946).
4. S. Perlis, Theory of matrices (Cambridge, 1952).
5. I. Schur, Ein Satz ueber quadratische Formen mit komplexen Koeffizienten, Amer. J. Math., 67 (1945), 472.
6. N. A. Wiegmann, Some theorems on matrices with real quaternion elements, Can. J. Math., 7 (1955), 191-201.
7. N. A. Wiegmann, On unitary and symmetric matrices with real quaternion elements, Can. J. Math., 8 (1954), 32-39.
8. J. Williamson, A polar representation of singular matrices, Bull. Amer. Math. Soc., 41 (1935), 118-123.
9. A. Wintner and F. D. Murnaghan, On a polar representation of non-singular square matrices, Proc. Nat. Acad. Sci., U.S.A., 17 (1931), 676-678.


Catholic University 
Washington, D.C. 



















ON NILPOTENT PRODUCTS OF CYCLIC GROUPS 
RUTH REBEKKA STRUIK 


Introduction. In this paper $G = F/F_n$ is studied for $F$ a free product
of a finite number of cyclic groups, and $F_n$ the normal subgroup generated by
commutators of weight $n$. The case of $n = 4$ is completely treated ($F/F_2$ is
well known; $F/F_3$ is completely treated in (2)); special cases of $n > 4$ are
studied; a partial conjecture is offered in regard to the unsolved cases. For
$n = 4$ a multiplication table and other properties are given.

The problem arose from Golovin’s work on nilpotent products ((1), (2), 
(3)) which are of interest because they are generalizations of the free and 
direct product of groups: all nilpotent groups are factor groups of nilpotent 
products in the same sense that all groups are factor groups of free products, 
and all Abelian groups are factor groups of direct products. In particular (as 
is well known) every finite Abelian group is a direct product of cyclic groups. 
Hence it becomes of interest to investigate nilpotent products of finite cyclic 
groups. 

Golovin has done this (as well as other things) in (2) and (3). In (2) there 
are results for the first nilpotent product (metabelian product) and in (3) 
there is a unique decomposition theorem for nilpotent products of finite cyclic 
groups. 

It might be conjectured that all finite nilpotent groups are nilpotent pro- 
ducts of cyclic groups. However, in (2) and (3) Golovin notes examples of 
non-Abelian groups with ((G, G), G) = 1 which are not of this form. Here it 
is shown that the Burnside group with exponent 3 (with three or more 
generators) is not of this form. 

To be more precise, and using Golovin's notation: Let

$$F = \prod\nolimits^{*}_{i} A_i$$

be the free product of the $A_i$. Let $(a, b) = a^{-1}b^{-1}ab$ and $(A, B) = \{(a, b) \mid
a \in A,\ b \in B\}$ where $A$ and $B$ are subgroups of a group. Let $(A_i) = \{(A_i, A_j) \mid
i \neq j\}$ where the $A_i$ are considered as subgroups of $F$ (the $i$ in $(A_i)$ is to
indicate that it is formed from the $A_i$ in $F$). Let ${}_{0}(A_i)_F$ be the normal subgroup
generated by $(A_i)$ in $F$, ${}_{k}(A_i)_F = ({}_{k-1}(A_i)_F, F)$. Then according to
Golovin (1), the $k$th nilpotent product of the $A_i$ is

$$G = A_1\,(k)\,A_2\,(k)\cdots(k)\,A_t = F/{}_{k}(A_i)_F.$$

(If the $A_i$ are cyclic, then $G = F/F_{k+2}$.)
From now on, Golovin's notation will be dropped.
Received October 17, 1958; in revised form January 21, 1960. 













In (6) it is shown that if $F$ is a free group with a finite number of generators,
then every element of $F/F_n$ can be uniquely expressed as a product of standard
commutators. Here it is shown that if $F$ is replaced by a free product of cyclic
groups, then Hall's results hold "essentially" provided that all primes appearing
in the orders of the factors are $\geq n - 1$. If the primes are $< n - 1$, then
the situation is complicated. The case $n = 4$ is completely treated here (that
is, $p = 2$, $n = 4$); partial results and conjectures are offered for $n > 4$ and
$p < n - 1$.

Section 1 gives preliminary results. In § 2, the "well-behaved" case ($p \geq
n - 1$) is handled, and in § 3, the other cases are discussed.

The author would like to thank W. Magnus for encouragement while 
preparing this paper, and R. Ree for reading the manuscript and for helpful 
criticisms. The author is also indebted to the referee for many improvements. 


1. Preliminaries. Let $G$ be an arbitrary group. As usual, $(a, b) = a^{-1}b^{-1}ab$
for $a, b \in G$ and if $A$, $B$ are subgroups of $G$, then $(A, B) = \{(a, b) \mid a \in A,\
b \in B\}$. The lower central series of $G$ is an infinite sequence of subgroups,
$G_1, G_2, \ldots$, where $G_1 = G$, $G_2 = (G, G), \ldots, G_{n+1} = (G_n, G)$.
$(((a_1, a_2), a_3), \ldots, a_n)$ will often be abbreviated $(a_1, \ldots, a_n)$. An element of the form
$(((a_1, a_2), (a_3, a_4)), \ldots, a_n)$ (that is, with arbitrary arrangement of parentheses)
will often be referred to as a commutator (of weight $n$), as opposed
to a member of $G_n$ which is (in general) a product of commutators (of weight
$n$ or greater). In this paper, $F$ will stand for a free product of a finite number
of cyclic groups: $F = \prod^{*} A_i$, $A_i$ cyclic. ($A_i$ may be finite or infinite.) The
following identities are often useful:


$$(xy, z) = (x, z)((x, z), y)(y, z), \tag{1}$$

$$(x, yz) = (x, z)(z, (y, x))(x, y).$$
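The two identities (1) hold in every group, so they can be spot-checked numerically. The following is a minimal sketch, assuming permutations of five symbols as the test group; the helper names (`compose`, `comm`, `check_identities`) are illustrative and do not come from the paper.

```python
import random
from itertools import product

def compose(p, q):
    """Product of two permutations stored as tuples (apply p, then q)."""
    return tuple(q[p[i]] for i in range(len(p)))

def inverse(p):
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)

def mul(*factors):
    """Product of several permutations, taken from left to right."""
    result = tuple(range(len(factors[0])))
    for f in factors:
        result = compose(result, f)
    return result

def comm(a, b):
    """Commutator in the paper's convention: (a, b) = a^-1 b^-1 a b."""
    return mul(inverse(a), inverse(b), a, b)

def check_identities(x, y, z):
    """True if both identities of (1) hold for the triple x, y, z."""
    first = comm(mul(x, y), z) == mul(comm(x, z), comm(comm(x, z), y), comm(y, z))
    second = comm(x, mul(y, z)) == mul(comm(x, z), comm(z, comm(y, x)), comm(x, y))
    return first and second

rng = random.Random(0)
samples = [tuple(rng.sample(range(5), 5)) for _ in range(6)]
all_hold = all(check_identities(x, y, z) for x, y, z in product(samples, repeat=3))
print(all_hold)
```

A random sample is of course no proof; both identities also follow directly by expanding the commutators.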


In (6), the following theorem is proved: 


THEOREM H1. Let $F$ be a free group with $t$ generators, $u_1, u_2, \ldots, u_t$. Let
$u_1, \ldots, u_s$ be a sequence of standard commutators of weight $< n$ (see (7)) of
non-decreasing weight. Then every element, $g$, of $F/F_n = G$ (free nilpotent
group) can be uniquely expressed as

$$g = \prod_{i=1}^{s} u_i^{c_i}$$

where the $c_i$ are rational integers. If

$$h = \prod_{i=1}^{s} u_i^{d_i}$$

then

$$gh = \prod_{i=1}^{s} u_i^{e_i},$$

where $e_i = f_i(c_j, d_k)$ are polynomials with integer coefficients in the $c_j$ and the
$d_k$ (for example, $e_i = c_i + d_i$; $1 \leq i \leq t$). If $s$-tuples of the form $(c_1, \ldots, c_s)$,
$c_i$ rational integers, are taken with multiplication given by $(c_1, \ldots, c_s)(d_1, \ldots,
d_s) = (f_1(c_j, d_k), \ldots, f_s(c_j, d_k))$, the set of these $s$-tuples forms a nilpotent group
isomorphic to $F/F_n$.


Throughout this paper, Hall’s collection process will be frequently used. 
Several of its important theorems will now be summarized: 


THEOREM H2. Let $R$, $S$ be any two elements of a group; let $u_1, u_2, \ldots,$ be a
fixed sequence of commutators in $R$ and $S$ of non-decreasing weight, that is,
$u_1 = (R, S)$, $u_2 = ((R, S), R)$, $u_3 = ((R, S), S)$, etc. Then

$$(RS)^n = R^nS^nu_1^{f_1(n)} \cdots u_i^{f_i(n)} \cdots \tag{2}$$

where

$$f_i(n) = a_1\binom{n}{1} + a_2\binom{n}{2} + \cdots + a_{w_i}\binom{n}{w_i}, \tag{3}$$

$a_i$ are rational integers and $w_i$ is the weight of $u_i$ as a commutator in $R$ and $S$.
(2) is an identity if the group is nilpotent; otherwise (2) can be considered as
giving a series of "approximations" to $(RS)^n$ modulo successive members of the
lower central series.
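In a group of class 2 the expansion (2) stops after $u_1$, and with the convention $(R, S) = R^{-1}S^{-1}RS$ one finds $f_1(n) = -\binom{n}{2}$, which is of the form (3) with $a_1 = 0$, $a_2 = -1$. The sketch below checks this, assuming the integer Heisenberg group (upper unitriangular $3 \times 3$ matrices, stored as triples) as the test group; the group and the function names are choices made here for illustration, not taken from the paper.

```python
from math import comb

def mul(a, b):
    """Product of unitriangular 3x3 matrices [[1,x,z],[0,1,y],[0,0,1]] stored as (x, y, z)."""
    return (a[0] + b[0], a[1] + b[1], a[2] + b[2] + a[0] * b[1])

def inv(a):
    return (-a[0], -a[1], -a[2] + a[0] * a[1])

def power(a, n):
    """a^n for n >= 0 by repeated multiplication."""
    result = (0, 0, 0)
    for _ in range(n):
        result = mul(result, a)
    return result

def comm(a, b):
    """(a, b) = a^-1 b^-1 a b."""
    return mul(mul(inv(a), inv(b)), mul(a, b))

def hall_class2_holds(R, S, n):
    """Check (RS)^n = R^n S^n u_1^{-C(n,2)} with u_1 = (R, S)."""
    u1 = comm(R, S)
    lhs = power(mul(R, S), n)
    rhs = mul(mul(power(R, n), power(S, n)), power(inv(u1), comb(n, 2)))
    return lhs == rhs

print(all(hall_class2_holds((2, -3, 5), (-1, 4, 7), n) for n in range(8)))
```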


The proof of Theorem H1 also gives 


THEOREM H3. Let $R_1, R_2, \ldots, R_s$ be any $s$ elements of a group. Let $u_1, u_2, \ldots,$
be a fixed sequence of commutators in the $R_i$ of non-decreasing weight (weight
$\geq 2$). Let $i_1, i_2, \ldots, i_s$ be any fixed permutation of $1, 2, \ldots, s$. Then

$$(R_1R_2 \cdots R_s)^n = R_{i_1}^nR_{i_2}^n \cdots R_{i_s}^nu_1^{f_1(n)} \cdots u_i^{f_i(n)} \cdots \tag{4}$$

where $f_i(n)$ are of form (3) with $w_i$ the weight of $u_i$ in the $R_i$.


From Theorem H1 we can obtain 


LEMMA H1. Let $X$, $Y$ be any elements of a group, and let $u_1, u_2, \ldots,$ be any
fixed sequence of commutators in $X$ and $(X, Y)$ of non-decreasing weight; then

$$(X^n, Y) = (X, Y)^nu_1^{f_1(n)} \cdots u_i^{f_i(n)} \cdots \tag{5}$$

where the $f_i(n)$ are like (3) with $w_i$ as the weight of $u_i$ in $X$ and $(X, Y)$.

Proof of Lemma H1. (5) follows from (2) in view of

$$(X^n, Y) = X^{-n}Y^{-1}X^nY = X^{-n}(Y^{-1}XY)^n = X^{-n}(X(X, Y))^n = X^{-n}X^n(X, Y)^nu_1^{f_1(n)} \cdots = (X, Y)^nu_1^{f_1(n)} \cdots.$$
LEMMA H2. Let $a$ be a fixed integer and $G$ a group such that $G_n = 1$. Then
if $b_j \in G$ and $r < n$,

$$(b_1, \ldots, b_{i-1}, b_i^{a}, b_{i+1}, \ldots, b_r) = (b_1, \ldots, b_r)^{a}v_1^{f_1(a)}v_2^{f_2(a)} \cdots \tag{6}$$

where the $v_k$ are commutators in $b_1, \ldots, b_r$ of weight $> r$, and every $b_j$, $1 \leq j \leq r$,
appears in each commutator $v_k$. The $f_i$ are of form (3) where $w_i$ is the weight
of $v_i$ minus $(r - 1)$.













Proof. (6) is (5) with $r = 2$, $i = 1$, and $a = n$. For $r = 2$, $i = 2$, and
$a = n$, take inverses on each side of (5):

$$(Y, X^{a}) = u_s^{-f_s(a)} \cdots u_1^{-f_1(a)}(Y, X)^{a} \tag{7}$$

where $u_s \in G_{n-1}$. Since $G_n = 1$, $s$ is finite. Now apply (4):

$$(Y, X^{a}) = u_s^{-f_s(a)} \cdots u_1^{-f_1(a)}(Y, X)^{a} \qquad [R_i = (Y, X)^{a} \text{ or } u_i^{-f_i(a)}]$$

$$= (Y, X)^{a}u_1^{-f_1(a)} \cdots u_s^{-f_s(a)}w_1w_2 \cdots \tag{8}$$

where the $w_i$ are commutators in $(Y, X)^{a}$ and $u_i^{-f_i(a)}$. Use induction starting
with $(Y, X) \in G_{n-1}$. For $(Y, X) \in G_{n-j}$, assume the theorem (that is, (6))
is true for commutators $\in G_{n-j+1}$, and use this and (1) to express $w_i$ in desired
form. One will obtain as exponents in the expansions, expressions of the form


$$\binom{\binom{a}{i}}{j}. \tag{9}$$

From its meaning in terms of the number of subsets of a set, (9) is an integral-valued
function of $a$ (of degree $i \times j$). By (3.21) p. 64 of (5), this can be
expressed in the form

$$a_1\binom{a}{1} + a_2\binom{a}{2} + \cdots + a_{ij}\binom{a}{ij},$$

$a_i$ rational integers. This is sufficient to show

$$(Y, X^{a}) = (Y, X)^{a}\prod v_i^{f_i(a)} \tag{10}$$

which completes the proof for $r = 2$.
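The integer coefficients in the expansion of (9) can be computed by forward differences (Newton's interpolation in the binomial basis). The sketch below is an illustration under that standard method, not the argument of (5); the function names are hypothetical.

```python
from math import comb

def binomial_basis_coefficients(f, degree):
    """Return [a_0, ..., a_degree] such that f(a) = sum_k a_k * C(a, k).

    Uses repeated forward differences of the values f(0), ..., f(degree);
    valid when f is a polynomial of degree at most `degree`.
    """
    row = [f(a) for a in range(degree + 1)]
    coefficients = []
    while row:
        coefficients.append(row[0])
        row = [row[k + 1] - row[k] for k in range(len(row) - 1)]
    return coefficients

def nested_binomial(i, j):
    """The function a -> C(C(a, i), j) of expression (9)."""
    return lambda a: comb(comb(a, i), j)

def evaluate(coefficients, a):
    return sum(c * comb(a, k) for k, c in enumerate(coefficients))

coefficients = binomial_basis_coefficients(nested_binomial(2, 2), 4)
print(coefficients)  # [0, 0, 0, 3, 3]
```

For $i = j = 2$ this gives $\binom{\binom{a}{2}}{2} = 3\binom{a}{3} + 3\binom{a}{4}$, an expansion of degree $i \times j = 4$ with integer coefficients and no constant term.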


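Suppose true for $r$, then for

$$(c_1, \ldots, c_i^{a}, \ldots, c_{r+1}), \qquad i > 2 \tag{11}$$

put $b_1 = (c_1, c_2)$, $b_j = c_{j+1}$, $j = 2, \ldots, r$ in (6) and use induction hypothesis.
For

$$(c_1^{a}, c_2, \ldots, c_{r+1}) \tag{12}$$

put

$$X = (c_1^{a}, c_2, \ldots, c_r), \qquad Y = c_{r+1}.$$

By induction

$$X = (c_1, \ldots, c_r)^{a}\prod v_i^{f_i(a)}.$$

Now use (1) an appropriate number of times with $x, y = (c_1, \ldots, c_r)^{a}$ or $v_i^{f_i(a)}$, $z = c_{r+1}$,
and the induction hypothesis to put

$$(X, Y) = \Big((c_1, \ldots, c_r)^{a}\prod v_i^{f_i(a)},\ c_{r+1}\Big) \tag{13}$$

in the form of (6).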
















A similar proof holds for

$$(c_1, c_2^{a}, c_3, \ldots, c_{r+1}).$$


Throughout this proof, we have implicitly used the fact that an arbitrary
commutator can be expressed as a product of commutators of the form
$(b_1, \ldots, b_r)$. Or to express the same idea in a different way, (6) can be proved
in the same way, if $(b_1, \ldots, b_{i-1}, b_i^{a}, b_{i+1}, \ldots, b_r)$ and $(b_1, \ldots, b_r)$ are
replaced by arbitrary commutators (that is, monomial commutators with
parentheses arranged arbitrarily).

Let gcd stand for greatest common divisor and $\gcd(\alpha_1, \ldots, \alpha_k)$ stand for
the gcd of the rational integers $\alpha_1, \ldots, \alpha_k$. The $\gcd(\alpha_1, \ldots, \alpha_k, 0) = \gcd(\alpha_1, \ldots, \alpha_k)$.
This should not be confused with $(a_1, \ldots, a_k)$, a member of
$G_k$, $G$ a group, since the $a_i$ belong to a group, and will not be rational integers (in this
paper). A cyclic group of order 0 will be understood to be infinite cyclic.
LEMMA 1. Let

$$F = \prod_{i=1}^{t}{}^{*} A_i,$$

$A_i$ cyclic of order $\alpha_i$. Let $a_i$ generate $A_i$. Let $n > 3$ be a fixed positive integer, and
let all primes appearing in the factorizations of the $\alpha_i$ be $\geq n - 1$. Let $G = F/F_n$.
If $v \in G_k$ and

$$v = (a_{i_1}, \ldots, a_{i_k}),$$

then $v^N = 1$, where

$$N = \gcd(\alpha_{i_1}, \ldots, \alpha_{i_k}), \qquad k \geq 2$$

(some of the $a_{i_j}$ (or $\alpha_{i_j}$) may equal each other). If $w$ is a product of commutators
like $v$ in which every commutator contains all the distinct $a_{i_j}$ appearing in $v$, then
$w^N = 1$. Hence $w^N = 1$ where $w$ is an arbitrary commutator.


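Proof. Let

$$v = (a_{i_1}, \ldots, a_{i_{n-1}}) \in G_{n-1}.$$

By (6)

$$1 = (a_{i_1}, \ldots, a_{i_j}^{\alpha_{i_j}}, \ldots, a_{i_{n-1}}) = (a_{i_1}, \ldots, a_{i_{n-1}})^{\alpha_{i_j}}\prod v_m^{f_m(\alpha_{i_j})}, \qquad 1 \leq j \leq n - 1 \tag{14}$$

where all $v_m = 1$ since $G_n = 1$. Hence the Lemma holds for $k = n - 1$. Since
$G_{n-1}$ is Abelian, $w^N = 1$ if $w$ is a product of commutators of weight $n - 1$
in which the same $a_{i_j}$ appear in each commutator.
Suppose true for $k + 1$, that is, if

$$v = (a_{i_1}, \ldots, a_{i_{k+1}}),$$

then $v^N = 1$ where

$$N = \gcd(\alpha_{i_1}, \ldots, \alpha_{i_{k+1}}),$$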













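and if $w$ is a product of commutators of weight $k + 1$ or greater in
$a_{i_1}, \ldots, a_{i_{k+1}}$, then $w^N = 1$. Consider $(a_{i_1}, \ldots, a_{i_k})$. By (6)

$$1 = (a_{i_1}, \ldots, a_{i_j}^{\alpha_{i_j}}, \ldots, a_{i_k}) = (a_{i_1}, \ldots, a_{i_k})^{\alpha_{i_j}}\prod v_m^{f_m(\alpha_{i_j})}, \qquad 1 \leq j \leq k. \tag{15}$$

Hence

$$(a_{i_1}, \ldots, a_{i_k})^N = 1 \quad \text{where } N = \gcd(\alpha_{i_1}, \ldots, \alpha_{i_k}).$$

Making use of (4), one obtains that if $w$ is the product of commutators of
weight $k$ or greater in $a_{i_1}, \ldots, a_{i_k}$, then $w^N = 1$. Note that every factor of
$w$ must contain all the distinct $a_{i_j}$, and that in a nilpotent group, every commutator
can be expressed as a product of commutators of the form $(a_{i_1}, \ldots, a_{i_k})$.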
For the case $n = 4$, (5) becomes:

LEMMA 2. If $G$ is any group and $a, b \in G$, then

$$(a^r, b^s) \equiv (a, b)^{rs}((a, b), a)^{s\binom{r}{2}}((a, b), b)^{r\binom{s}{2}} \mod G_4, \tag{16}$$

$$(b^r, a^s) \equiv (a, b)^{-rs}((a, b), a)^{-r\binom{s}{2}}((a, b), b)^{-s\binom{r}{2}} \mod G_4,$$

where $\binom{r}{2} = \dfrac{r(r - 1)}{2}$.

Lemma 2 is proved in (14) and is a particular case of (5) in which the $f_i(n)$
have been computed. The proof of (16) is based on the work of Magnus (11).
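In a group with $G_4 = 1$ the congruences of Lemma 2 become equalities, which allows a numerical check. The sketch below assumes the group of $4 \times 4$ upper unitriangular integer matrices (nilpotent of class 3) as the test group and uses the exponents as written in (16); the matrices and helper names are chosen here for illustration only.

```python
from math import comb

N = 4
IDENTITY = tuple(tuple(int(i == j) for j in range(N)) for i in range(N))

def mul(a, b):
    return tuple(tuple(sum(a[i][k] * b[k][j] for k in range(N)) for j in range(N))
                 for i in range(N))

def inv(a):
    """Inverse of a unitriangular matrix I + n via I - n + n^2 - n^3 (n^4 = 0)."""
    n = tuple(tuple(a[i][j] - IDENTITY[i][j] for j in range(N)) for i in range(N))
    n2 = mul(n, n)
    n3 = mul(n2, n)
    return tuple(tuple(IDENTITY[i][j] - n[i][j] + n2[i][j] - n3[i][j]
                       for j in range(N)) for i in range(N))

def power(a, e):
    base = a if e >= 0 else inv(a)
    result = IDENTITY
    for _ in range(abs(e)):
        result = mul(result, base)
    return result

def comm(a, b):
    """(a, b) = a^-1 b^-1 a b."""
    return mul(mul(inv(a), inv(b)), mul(a, b))

def lemma2_holds(a, b, r, s):
    """Check both formulas of Lemma 2 for exponents r, s >= 0."""
    c = comm(a, b)
    ca, cb = comm(c, a), comm(c, b)
    first = comm(power(a, r), power(b, s)) == mul(
        mul(power(c, r * s), power(ca, s * comb(r, 2))), power(cb, r * comb(s, 2)))
    second = comm(power(b, r), power(a, s)) == mul(
        mul(power(c, -r * s), power(ca, -r * comb(s, 2))), power(cb, -s * comb(r, 2)))
    return first and second

A = ((1, 1, 2, 0), (0, 1, 3, 1), (0, 0, 1, 2), (0, 0, 0, 1))
B = ((1, 2, 0, 1), (0, 1, 1, 4), (0, 0, 1, 1), (0, 0, 0, 1))
print(all(lemma2_holds(A, B, r, s) for r in range(5) for s in range(5)))
```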


2. The ‘‘well-behaved”’ case. 


THEOREM 1. Let $A_1, A_2, A_3$ be cyclic groups of orders $\alpha_1, \alpha_2, \alpha_3$ respectively,
$\alpha_i$ odd integers. Let $a_i$ generate $A_i$. Let

$$F = \prod_{i=1}^{3}{}^{*} A_i.$$

Let $u_1, \ldots, u_{14}$ be a sequence of standard monomial commutators of non-decreasing
weight in $a_1, a_2, a_3$ of weight $\leq 3$. (See (7).) Let $N_i = \alpha_i$ if $u_i$ is
of weight 1; let $N_i = \gcd(\alpha_j, \alpha_k)$ if $u_i = (a_j, a_k)$, and let $N_i = \gcd(\alpha_j, \alpha_k, \alpha_l)$
if $a_j, a_k, a_l$ appear in $u_i$ of weight 3. Then every element $g$ of $F/F_4$ can be
uniquely expressed as

$$g = \prod u_i^{c_i} \tag{17}$$

where the $c_i$ are integers modulo $N_i$. If

$$h = \prod u_i^{d_i}$$



















is another element of $F/F_4$, then

$$gh = \prod u_i^{e_i}$$

where $e_i = f_i(c_j, d_k)$ are the polynomials with integral coefficients of Theorem H1.
(Theorem 1 is a generalization of a lemma appearing in (15).)


Proof. By Lemma 1, $u_i^{N_i} = 1$. Hence every element of $G$ can be expressed
in the form of (17) where the $c_i$ are integers modulo $N_i$. The problem is to
show that this expression is unique.

Let $u_1, \ldots, u_{14}$ be $a_1$, $a_2$, $a_3$, $(a_1, a_2)$, $(a_1, a_3)$, $(a_2, a_3)$, $(a_1, a_2, a_1)$, $(a_1, a_3, a_1)$,
$(a_2, a_3, a_2)$, $(a_1, a_2, a_2)$, $(a_1, a_3, a_3)$, $(a_2, a_3, a_3)$, $(a_1, a_2, a_3)$, $(a_2, a_3, a_1)$, respectively.
If another sequence of standard commutators is chosen, a similar proof
will hold. Since $(a_i, a_j)$, $i \neq j$, generate $(G, G)$ modulo $G_3$ and since

$$((a, b), c)((b, c), a)((c, a), b) \equiv 1 \text{ modulo } G_4 \quad \text{(see (11))}$$

and $(a_i, a_j, a_k)$ generate $G_3$ modulo $G_4$, the $u_i$ specified above do form a
basis for $G$. The following change of notation will be made:


let $u_{ij} = (a_i, a_j)$ and designate the corresponding $c_i, d_i, e_i$ by $c_{ij}, d_{ij}, e_{ij}$
respectively;

let $u_{iji} = (a_i, a_j, a_i)$ and designate the corresponding $c_i, d_i, e_i$ by $c_{iji}$,
$d_{iji}$, $e_{iji}$ where $i < j$;

let $u_{ijj} = (a_i, a_j, a_j)$ and designate the corresponding $c_i, d_i, e_i$ by $c_{ijj}$,
$d_{ijj}$, $e_{ijj}$ where $i < j$;

let $u_{ijk} = (a_i, a_j, a_k)$ and designate the corresponding $c_i, d_i, e_i$ by $c_{ijk}$,
$d_{ijk}$, $e_{ijk}$ where $i < j < k$;

let $u_{jki} = (a_j, a_k, a_i)$ and designate the corresponding $c_i, d_i, e_i$ by $c_{jki}$,
$d_{jki}$, $e_{jki}$ where $i < j < k$.

For Theorem 1, $u_{13}$ and $u_{14}$ are $u_{123}$ and $u_{231}$ respectively, but the more
general notation is used here for the sake of Theorem 2.
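Then a somewhat laborious computation gives

$$\begin{aligned}
e_i &= c_i + d_i\\
e_{ij} &= c_{ij} + d_{ij} - c_jd_i\\
e_{iji} &= c_{iji} + d_{iji} - c_j\binom{d_i}{2} + c_{ij}d_i\\
e_{ijj} &= c_{ijj} + d_{ijj} - d_i\binom{c_j}{2} + c_{ij}d_j - c_jd_id_j\\
e_{ijk} &= c_{ijk} + d_{ijk} + c_{ij}d_k + c_{ik}d_j - d_ic_jc_k - c_jd_id_k - c_kd_id_j\\
e_{jki} &= c_{jki} + d_{jki} + c_{jk}d_i + c_{ik}d_j - c_kd_id_j.
\end{aligned} \tag{18}$$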












Note that these are the $f_i(c_j, d_k)$ of Theorem H1 for $n = 4$, and the particular
sequence of $u_i$ chosen here. Also note that they apply unambiguously if they
are interpreted as integers modulo the appropriate gcd. For example, $e_{121}$ is
an integer modulo $\gcd(\alpha_1, \alpha_2)$; $c_2$, $d_1$, and $c_{12}$ appear in its formula, but since
$c_2$, $d_1$, and $c_{12}$ are integers modulo $\alpha_2$, $\alpha_1$, and $\gcd(\alpha_1, \alpha_2)$ respectively, no
ambiguity arises in the computation of a particular $e_{121}$. By Theorem H1, if
one takes 14-tuples, $(c_1, \ldots, c_{14})$, $(d_1, \ldots, d_{14})$, $c_i$, $d_i$ rational integers and
lets (18) define a multiplication, a group isomorphic to $F/F_4$ (free nilpotent
group) ($F$ a free group) is obtained. The same proof will go through if the
$c_i$, $d_i$ are integers modulo the appropriate gcd. (One can also check the group
axioms directly, a tedious verification.) Note that $\alpha_i$ odd is essential here,
since (18) involves

$$\binom{c_i}{2}, \qquad \binom{d_i}{2},$$

and this will give difficulty if one is dealing with integers modulo an even
integer.
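The direct check of the group axioms mentioned above can be delegated to a short program. The following is a minimal sketch, assuming the multiplication table (18) in the form $e_{ij} = c_{ij} + d_{ij} - c_jd_i$, $e_{iji} = c_{iji} + d_{iji} - c_j\binom{d_i}{2} + c_{ij}d_i$, $e_{ijj} = c_{ijj} + d_{ijj} - d_i\binom{c_j}{2} + c_{ij}d_j - c_jd_id_j$, and the two weight-3 formulas for $(1,2,3)$ and $(2,3,1)$ as they appear in the code comments; the dictionary layout, the sample orders and all function names are illustrative choices, not the author's.

```python
import random
from math import gcd

PAIRS = [(1, 2), (1, 3), (2, 3)]
KEYS = ([(i,) for i in (1, 2, 3)] + PAIRS
        + [(i, j, i) for i, j in PAIRS] + [(i, j, j) for i, j in PAIRS]
        + [(1, 2, 3), (2, 3, 1)])          # the 14 basis commutators

def binom2(x):
    return x * (x - 1) // 2

def moduli(orders):
    """N_i = gcd of the orders of the generators occurring in u_i (0 = infinite)."""
    order = dict(zip((1, 2, 3), orders))
    mods = {}
    for key in KEYS:
        n = 0
        for index in set(key):
            n = gcd(n, order[index])
        mods[key] = n
    return mods

def multiply(c, d, orders=(0, 0, 0)):
    """Exponents e of gh from those of g (c) and h (d), following table (18)."""
    e = {(i,): c[(i,)] + d[(i,)] for i in (1, 2, 3)}
    for i, j in PAIRS:
        cj, di, dj, cij = c[(j,)], d[(i,)], d[(j,)], c[(i, j)]
        e[(i, j)] = cij + d[(i, j)] - cj * di
        e[(i, j, i)] = c[(i, j, i)] + d[(i, j, i)] - cj * binom2(di) + cij * di
        e[(i, j, j)] = (c[(i, j, j)] + d[(i, j, j)] - di * binom2(cj)
                        + cij * dj - cj * di * dj)
    c2, c3 = c[(2,)], c[(3,)]
    d1, d2, d3 = d[(1,)], d[(2,)], d[(3,)]
    # e_123 = c_123 + d_123 + c_12 d_3 + c_13 d_2 - d_1 c_2 c_3 - c_2 d_1 d_3 - c_3 d_1 d_2
    e[(1, 2, 3)] = (c[(1, 2, 3)] + d[(1, 2, 3)] + c[(1, 2)] * d3 + c[(1, 3)] * d2
                    - d1 * c2 * c3 - c2 * d1 * d3 - c3 * d1 * d2)
    # e_231 = c_231 + d_231 + c_23 d_1 + c_13 d_2 - c_3 d_1 d_2
    e[(2, 3, 1)] = (c[(2, 3, 1)] + d[(2, 3, 1)] + c[(2, 3)] * d1 + c[(1, 3)] * d2
                    - c3 * d1 * d2)
    mods = moduli(orders)
    return {k: (v % mods[k] if mods[k] else v) for k, v in e.items()}

def random_element(rng):
    return {k: rng.randint(-4, 4) for k in KEYS}

def is_associative(orders, trials=200, seed=1):
    rng = random.Random(seed)
    for _ in range(trials):
        g, h, k = (random_element(rng) for _ in range(3))
        left = multiply(multiply(g, h, orders), k, orders)
        right = multiply(g, multiply(h, k, orders), orders)
        if left != right:
            return False
    return True

print(is_associative((0, 0, 0)), is_associative((9, 3, 15)))
```

Orders `(0, 0, 0)` correspond to the free nilpotent case of Theorem H1; `(9, 3, 15)` is an arbitrary choice of odd orders.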


THEOREM 2. Let $A_1, \ldots, A_t$ be cyclic groups of order $\alpha_1, \ldots, \alpha_t$ respectively,
$\alpha_i$ odd integers or 0. Let $a_i$ generate $A_i$. Let

$$F = \prod_{i=1}^{t}{}^{*} A_i.$$

Let $u_1, u_2, \ldots,$ be a sequence of standard (monomial) commutators of non-decreasing
weight in the $a_i$ of weight $\leq 3$ (see (7)). Let $N_i = \alpha_i$ if $u_i$ is of
weight 1; $N_i = \gcd(\alpha_j, \alpha_k)$ if $u_i = (a_j, a_k)$ and $N_i = \gcd(\alpha_j, \alpha_k, \alpha_l)$ if $a_j, a_k, a_l$
appear in $u_i$ (of weight 3). Then every element of $F/F_4$ can be uniquely expressed
as

$$g = \prod u_i^{c_i}$$

where $c_i$ are integers modulo $N_i$. (If $N_i = 0$, then $c_i$ is a rational integer.) If

$$h = \prod u_i^{d_i}$$

is another element of $F/F_4$, then

$$gh = \prod u_i^{e_i}$$

where $e_i = f_i(c_j, d_k)$ are the polynomials with integral coefficients of Theorem H1.

Proof. The proof is the same as that of Theorem 1. (18) is a multiplication
table for $G$ provided the standard commutators are arranged in the order:
$a_i$, $(a_i, a_j)$, $(a_i, a_j, a_i)$, $(a_i, a_j, a_j)$, $(a_i, a_j, a_k)$, $(a_j, a_k, a_i)$ with $i < j < k$.


Comment. Since every finite nilpotent group is a direct product of prime
power groups, the $\alpha_i$ may be assumed to be prime powers or 0.


COROLLARY 1. Let

$$g = \prod u_i^{c_i}$$

be a particular element of $G$. Then $g^N = 1$ where $N$ is the least common multiple
of the orders of the $u_i^{c_i}$ appearing in $g$ unless $g \notin (G, G)$ and $3 \mid N$. In the latter
case, $g^{3N} = 1$, and $g$ may be of order $3N$. If any of the $u_i$ appearing in $g$ are
infinite cyclic, then $g$ is of infinite order.


The author is indebted to the referee for a simplification of the statement 
and proof of this corollary. 


Proof. If $g \in (G, G)$, then since $(G, G)$ is Abelian, the Corollary follows. If
$g$ contains a $u_i$ which is infinite cyclic, then by (4) and the unique representation
of $g$, $g$ must be infinite cyclic. If $g \notin (G, G)$, and all the factors are
of finite order, then at least one of the $u_i$ is equal to an $a_j$. Looking at (4)
with the $n$ of (4) put equal to $N$, it is obvious that $g^N = 1$ (Lemma 1 is used
here) provided $3 \nmid N$, since the $f_i(N)$ will involve $N$,

$$\binom{N}{2}, \quad \text{and} \quad \binom{N}{3}.$$

(All commutators are of weight $\leq 3$.)
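The role of the prime 3 comes from an elementary divisibility fact: $k\binom{N}{k} = N\binom{N-1}{k-1}$, so $N$ divides $\binom{N}{k}$ whenever $k$ is prime to $N$, in particular for every $k$ below the least prime factor of $N$. The sketch below checks a few instances; the function name is hypothetical.

```python
from math import comb

def binomials_divisible(N, top):
    """True if N divides C(N, k) for every k with 1 <= k <= top."""
    return all(comb(N, k) % N == 0 for k in range(1, top + 1))

# N odd and prime to 3: N divides N, C(N,2) and C(N,3).
print(binomials_divisible(35, 3))
# 3 | N: C(9,3) = 84 is not divisible by 9 ...
print(binomials_divisible(9, 3))
# ... but N = 9 does divide C(3N, k) for k = 1, 2, 3.
print(all(comb(27, k) % 9 == 0 for k in (1, 2, 3)))
```

This matches the statement of the Corollary: the exponent $N$ suffices when $3 \nmid N$, and $3N$ suffices in general.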

If $3 \mid N$, i.e., $3 \mid \alpha_i$ for an $a_i$ appearing in $g$, then the above reasoning indicates
that $g^{3N} = 1$. $g$ can actually be of order $3N$; for example, if

$$G = \{a, b \mid a^3 = b^3 = 1,\ G_4 = 1\} \tag{19}$$

an actual computation shows that $ab$, $ab^2$, $a^2b$, and $a^2b^2$ are of order 9; in
this case

$$(a^ib^j)^3 \in G_3. \tag{20}$$

Another way of seeing this is to consider equation (7) of (14) (due to
Sanov), that is,

$$((a, b), b)^{N'} \in F(N)F_4 \tag{21}$$

where $F$ can be any group generated by $a$ and $b$ and $F(N)$ is the normal
subgroup generated by all $N = 3N'$ powers of elements of $F$. If $a$ and $b$ are
of order $N$ and if all elements of $F/F_4$ were of order $N$ (or less), then $((a, b), b)$
would be of order $\leq \frac{1}{3}N$ and not $N$ as Theorem 2 indicates (that is, $t = 2$,
$\alpha_1 = \alpha_2 = N$).
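The computation behind (19) and (20) can be reproduced with the two-generator part of table (18), taking all exponents modulo 3. The sketch below assumes that form of the table, with an element written as $a^{c_1}b^{c_2}(a,b)^{c_{12}}(a,b,a)^{c_{121}}(a,b,b)^{c_{122}}$; the 5-tuple encoding and function names are illustrative.

```python
def binom2(x):
    return x * (x - 1) // 2

def multiply(c, d, modulus=3):
    """Product in group (19), using the two-generator rows of table (18)."""
    c1, c2, c12, c121, c122 = c
    d1, d2, d12, d121, d122 = d
    e = (c1 + d1,
         c2 + d2,
         c12 + d12 - c2 * d1,
         c121 + d121 - c2 * binom2(d1) + c12 * d1,
         c122 + d122 - d1 * binom2(c2) + c12 * d2 - c2 * d1 * d2)
    return tuple(x % modulus for x in e)

IDENTITY = (0, 0, 0, 0, 0)

def power(g, n):
    result = IDENTITY
    for _ in range(n):
        result = multiply(result, g)
    return result

def order(g):
    n, x = 1, multiply(IDENTITY, g)
    while x != IDENTITY:
        x = multiply(x, g)
        n += 1
    return n

ab = (1, 1, 0, 0, 0)
print(power(ab, 3))  # (0, 0, 0, 2, 1): a non-trivial element of G_3
print(order(ab))     # 9
```

Here $N = 3$, and the cube of $ab$ lands in $G_3$ without being trivial, so $ab$ has order $3N = 9$.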


Comment. The group $G$ given by (19) is a kind of curiosity, for $p$-groups,
since it is not regular in the sense of Hall (5, p. 73). However all groups of
the form

$$G = \{a, b \mid a^{p^{\alpha}} = b^{p^{\beta}} = 1,\ G_4 = 1\} \tag{22}$$

with $p \geq 5$, $p$ a prime, are regular groups in the sense of Hall.
A similar comment can be made in connection with

$$G = \{a, b \mid a^2 = b^2 = 1,\ G_3 = 1\}, \tag{23}$$

a group of order 8.













COROLLARY 2. The group $S_t = \{a_i, 1 \leq i \leq t,\ s^3 = 1,\ s \in S_t\}$ is not a
nilpotent product of cyclic groups of order three, except for $t = 2$ when $S_2 = F/F_3$,
$F = \{a_1\} * \{a_2\}$. However, $S_t$ is a fully regular product (see (1)) of the $\{a_i\}$, and,
in particular, it is the third Burnside product of the $\{a_i\}$ (12).

Proof. The only candidates for $S_t$ to be a nilpotent product are the first
($F/F_3$) and second ($F/F_4$) nilpotent products. ($F$ a free product of cyclic
groups of order three.) Since $((a_1, a_2), a_1) \neq 1$ in $F/F_4$ while $((a_1, a_2), a_1) = 1$
in $S_t$ (cf. (9)), $S_t$ cannot be a second nilpotent product. As for the first nilpotent
product (that is, $F/F_3$), $(a_1, a_2, a_3) = 1$ in this case, while $(a_1, a_2, a_3) \neq 1$
in $S_t$. However, if $t = 2$, $S_2 = F/F_3$ where $F$ is the free product of two cyclic
groups of order three, and $S_2$ = first nilpotent product of $\{a_1\}$ and $\{a_2\}$ (2, 9).


THEOREM 3. Let $A_1, \ldots, A_t$ be cyclic groups of order $\alpha_1, \ldots, \alpha_t$ respectively.
If $A_i$ is infinite cyclic, let $\alpha_i = 0$. Let $a_i$ generate $A_i$; let $F = \prod_{i=1}^{t}{}^{*} A_i$. Let
$n > 3$ be a fixed positive integer and let all the primes appearing in the factorizations
of the $\alpha_i$ be $\geq n - 1$. Let $u_1, \ldots,$ be a sequence of standard monomial
commutators of non-decreasing weight in the $a_i$ of weight $\leq n - 1$. Let $N_i = \alpha_i$
if $u_i$ is of weight 1, and

$$N_i = \gcd(\alpha_{i_1}, \ldots, \alpha_{i_k}) \quad \text{if } a_{i_j},\ 1 \leq j \leq k,$$

appears in $u_i$. Then every element $g$ of $G = F/F_n$ can be uniquely expressed as

$$g = \prod u_i^{c_i}$$

where the $c_i$ are integers modulo $N_i$. (If $N_i = 0$, $c_i$ is a rational integer.) If

$$h = \prod u_i^{d_i}$$

is another element of $F/F_n$, then

$$gh = \prod u_i^{e_i}$$

where $e_i = f_i(c_j, d_k)$ are the polynomials with integer coefficients of Theorem H1.

We note that if $F$ were free, the $u_i$ of weight $k$ would form a basis for
$F_k/F_{k+1}$, see (7).


Proof. The proof is exactly the same as that of Theorems 1 and 2. Lemma 1
shows that the orders of the $u_i$ are as stated in the theorem, so that every
element $g$ is of the form stated, and the only problem is uniqueness. As
in Theorem 1, one can theoretically compute a multiplication table similar
to (18). This is computed by multiplying


¢1 c a d 
<r yy P 


a-1 


d daiy d 
uy. ey... aes as! us? (us, wi) a... 


etc., and using (5), (6), or (10), or a suitable modification of them. The 
coefficients of the multiplication table will involve 


3). @) oat) 
















Note that the $f_i(n)$ of largest order will come from applying (5) and (10) to


d ; da . . 
(us*,ui') or (uj’, ut, )i<j 
and since in (5) one is dealing with commutators in $X$ and $(X, Y)$, the corresponding
coefficients of the $f_i(c_j, d_k)$ will involve at most


Rap —_ Et). % ): 


Hence, since all the primes of the $\alpha_i$ are $\geq n - 1$, no ambiguity will occur because
the $c_i$ and $d_i$ are taken modulo the appropriate gcds. Hence Theorem H1 can
be used with the $f_i(c_j, d_k)$ considered as integers modulo the appropriate
gcds, and this is sufficient to prove the theorem.


COROLLARY. Let

$$g = \prod u_i^{c_i}$$

be the unique representation of an element of $G$. Let $N$ be the least common
multiple of the orders of the $u_i^{c_i}$ appearing in $g$.

Case I. If one of the $u_i$ is infinite cyclic, then $g$ is infinite cyclic.

Case II. All the primes appearing in the orders of the $u_i$ are greater than $n - 1$
or $g \in (G, G)$. ($g$ is assumed to have factors which are all of finite order.) Then
$g^N = 1$.

Case III. $g \notin (G, G)$ and $p$ (a prime) $= n - 1$ and $p$ appears in the factorization
of one of the $\alpha_i$ where $a_i$ is a factor of $g$. Then $g^{pN} = 1$, and there are cases
where $g^N \neq 1$.


Proof. Case I follows from (4) and the uniqueness of the representation
of $g$. (Consider what happens in (4) to the infinite cyclic $u_i$ of least weight.)
For Case II, consider (4) where $R_i = u_i^{c_i}$ (of $g$). If every $u_i$ (of $g$) $\in G_{n-1}$,
then (4) gives $g^N = 1$. If $g \in G_{n-s}$, use induction on $s$, (4), (6), Lemma 1,
and the fact that the $u_i$ of (4) can be expressed as products of commutators
of the form

$$(a_{i_1}, \ldots, a_{i_k}).$$


If all the primes appearing in the $\alpha_i$ of $u_i$ (of $g$) are greater than $n - 1$, the
$f_i(N)$ of (4) will involve

$$N, \binom{N}{2}, \ldots, \binom{N}{n-1}$$

and $N \mid f_i(N)$, hence

$$u_i^{f_i(N)} = 1$$

and $g^N = 1$. If $g \in (G, G)$, then the same proof holds except that the $f_i(N)$
involve













—). 


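For Case III, if $p = n - 1$, $p$ a prime, and $p \mid \alpha_i$ for some $a_i$ appearing in $g$,
$\binom{N}{p}$ may cause difficulty, but in any case,

$$N \,\Big|\, \binom{pN}{p}$$

and hence $g^{pN} = 1$. If $\alpha_1 = \alpha_2 = \cdots = \alpha_t = p^{\alpha} = N$ where $p = n - 1$, then
according to Sanov (13),

$$(a_1, \underbrace{a_2, \ldots, a_2}_{p-1 \text{ times}})^{p^{\alpha-1}} \in F(p^{\alpha})F_{p+1} = F(p^{\alpha})F_n \qquad (p = n - 1)$$

where

$$F(p^{\alpha}) = \{x^{p^{\alpha}} \mid x \in F\}.$$

If $g^N = 1$ for every element of $F/F_n$,

$$(a_1, \underbrace{a_2, \ldots, a_2}_{p-1 \text{ times}})$$

would have order $p^{\alpha-1}$ or less, which contradicts Theorem 3 (according to
which $(a_1, a_2, \ldots, a_2)$ has order $p^{\alpha}$). Hence there exist elements which have
order $p^{\alpha+1} = pN$. In view of the Corollary to Theorem 2, probably

$$(a_1a_2)^{p^{\alpha}} \neq 1.$$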


Comment. If $a_i^{p} = 1$, $1 \leq i \leq t$, $n \leq p$, $p$ a prime, all elements of $G$ are
of order $p$, and hence $G$ is a factor group of the Burnside group $B$ with exponent
$p$ in $t$ generators. In (10) and (13) it is shown that $B_s/B_{s+1}$ is of the
same rank as $F_s/F_{s+1}$ ($F$ the free group with $t$ generators) for $s = 1, 2, \ldots,
p - 1$. This provides a partial verification of Theorem 3.

Comment. In (4) Gruenberg states and proves "Hall's Second Basis
Theorem." It is essentially Theorem 3 for the case $\alpha_1 = \alpha_2 = \alpha_3 = \cdots
= \alpha_t = p$ and $n \leq p$. Theorem 3 shows that Hall's Second Basis Theorem
holds "one step further" for $n = p + 1$.


3. The "ill-behaved" case. If $p < n - 1$, the proofs above break down.
The case of $A = \{a\}$, $B = \{b\}$, $a^2 = b^2 = 1$ is of interest. In $F = A * B$ (the
free product of $A$ and $B$), $(A, B)$ is infinite cyclic and generated by $(a, b)$.
Since

$$1 = (a, b^2) = (a, b)^2((a, b), b), \tag{24}$$

$(a, b)^2 \in F_3$. Similarly

$$1 = ((a, b), b^2) = ((a, b), b)^2(((a, b), b), b) = (a, b)^{-4}(((a, b), b), b).$$

By induction,

$$(a, b)^{2^{n-2}} \in F_n;$$

hence in $F/F_n$,

$$(a, b)^{2^{n-2}} = 1.$$

By (8), the $F_n$, $n = 1, 2, \ldots,$ are all distinct and hence $(A, B)$ in $F/F_n$ is
exactly of order $2^{n-2}$ and $F/F_n$ is of order $2^n$.
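These orders can be illustrated concretely. The sketch below rests on an assumption not made in the text: that $F = A * B$ is identified with the infinite dihedral group, so that $F/F_n$ can be modelled by the dihedral group of order $2^n$ (rotations modulo $2^{n-1}$ together with reflections). The pair encoding and the function names are illustrative.

```python
def dihedral_mul(x, y, m):
    """Product in the dihedral group of order 2m; elements are (rotation mod m, flip)."""
    k1, f1 = x
    k2, f2 = y
    return ((k1 + (-k2 if f1 else k2)) % m, f1 ^ f2)

def commutator_order(n):
    """Order of (a, b) in the dihedral model of F/F_n, for a, b two generating reflections."""
    m = 2 ** (n - 1)
    a, b = (0, 1), (1, 1)              # a^2 = b^2 = 1
    ab = dihedral_mul(a, b, m)
    c = dihedral_mul(ab, ab, m)        # (a, b) = a^-1 b^-1 a b = abab
    order, x = 1, c
    while x != (0, 0):
        x = dihedral_mul(x, c, m)
        order += 1
    return order

for n in range(2, 8):
    print(n, commutator_order(n), 2 ** (n - 2))
```

In this model the commutator $(a, b)$ has order $2^{n-2}$ and the whole group has $2 \cdot 2^{n-1} = 2^n$ elements.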

That this is not a freak case can be seen from Theorem 4 below. Since
finite nilpotent groups are direct products of prime power groups, it is sufficient
for $n = 4$ to discuss the case of $p = 2$.

THEOREM 4. Let $A_i = \{a_i\}$, $1 \leq i \leq t$, be cyclic groups of order $2^{r_i}$. Let
$r_1 \leq r_2 \leq \cdots \leq r_t$. Let $F = \prod_{i=1}^{t}{}^{*} A_i$. Let $G = F/F_4$. Then every element
of $G$ can be expressed uniquely in the form

$$a_1^{c_1}a_2^{c_2} \cdots a_t^{c_t}\prod_{i<j}(a_i, a_j)^{c_{ij}}(a_i^{2}, a_j)^{c_{ij}^{(2)}}(a_i, a_j^{2})^{c_{ij}^{(3)}} \cdot \prod_{i<j<k}((a_i, a_j), a_k)^{c_{ijk}}((a_j, a_k), a_i)^{c_{jki}} \tag{25}$$

where the $c_i$, $c_{ij}$, $c_{ij}^{(2)}$, $c_{ijk}$, $c_{jki}$ are integers modulo

$$2^{r_i},\quad 2^{r_i+1},\quad 2^{r_i-1},\quad 2^{r_i},\quad 2^{r_i}$$

respectively, while $c_{ij}^{(3)}$ are integers modulo $2^{r_i-1}$ if $r_i = r_j$, and $2^{r_i}$ if $r_i \neq r_j$. In
particular, $(a_i, a_j)$ is of order $2^{r_i+1}$ for $i \neq j$.

Formulas for multiplying two elements of G are given below. 


Proof. Let $a$, $b$, $c$ be three of the $a_i$ of orders $n_a$, $n_b$, $n_c$, respectively,
$n_a \leq n_b \leq n_c$. By (16)

$$1 = (a^{n_a}, b) = (a, b)^{n_a}(a, b, a)^{\binom{n_a}{2}}, \qquad a, b \in G.$$

From the work of Magnus (11), it follows that

$$1 = (a, b, a)^{n_a} = (a, b, b)^{n_a} = (a, b, c)^{n_a} = (b, c, a)^{n_a} \quad \text{in } G.$$

Since $(G, G)$ is Abelian, and $\binom{n_a}{2} \equiv n_a/2 \pmod{n_a}$,

$$(a, b)^{2n_a} = 1 \tag{26}$$

and

$$(a, b)^{-n_a} = (a, b)^{n_a} = ((a, b), a)^{n_a/2}. \tag{27}$$

If $n_a = n_b$, the same reasoning gives

$$(a, b)^{n_a} = ((a, b), b)^{n_a/2}.$$













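However, if $n_a < n_b$, all that can be said is $((a, b), b)^{n_a} = 1$. In view of (26)
and (27), computing a multiplication table using a representation such as
(18) would be somewhat complicated; to avoid this difficulty, note that in $G$

$$\begin{aligned}(a^2, b) &= (a, b)^2((a, b), a)\\ (a, b^2) &= (a, b)^2((a, b), b)\end{aligned} \tag{28}$$

and hence $\{(a, b), ((a, b), a), ((a, b), b)\} = \{(a, b), (a, b^2), (a^2, b)\}$. Now, using
(27) and the fact that $(G, G)$ is Abelian,

$$(a^2, b)^{n_a/2} = (a, b)^{n_a}((a, b), a)^{n_a/2} = 1.$$

If $n_a = n_b$, then $(a, b^2)^{n_a/2} = 1$, while if $n_a < n_b$,

$$(a, b^2)^{n_a} = (a, b)^{2n_a}((a, b), b)^{n_a} = 1.$$

Hence every element of $G$ can be expressed in the form of (25). If one multiplies
two elements like (25), that is, let

$$\begin{aligned}
c &= a_1^{c_1}a_2^{c_2} \cdots (a_i, a_j)^{c_{ij}} \cdots ((a_i, a_j), a_k)^{c_{ijk}} \cdots\\
d &= a_1^{d_1}a_2^{d_2} \cdots (a_i, a_j)^{d_{ij}} \cdots ((a_i, a_j), a_k)^{d_{ijk}} \cdots\\
e &= a_1^{e_1}a_2^{e_2} \cdots (a_i, a_j)^{e_{ij}} \cdots ((a_i, a_j), a_k)^{e_{ijk}} \cdots
\end{aligned}$$

with $e = c \cdot d$, then

$$\begin{aligned}
e_i &= c_i + d_i\\
e_{ij} &= c_{ij} + d_{ij} - 2\alpha(c_{ij})d_i - 2\alpha(c_{ij})d_j - c_jd_i + 2c_j\binom{d_i}{2} + 2d_i\binom{c_j}{2} + 2c_jd_id_j\\
e_{ij}^{(2)} &= c_{ij}^{(2)} + d_{ij}^{(2)} + \alpha(c_{ij})d_i - c_j\binom{d_i}{2}\\
e_{ij}^{(3)} &= c_{ij}^{(3)} + d_{ij}^{(3)} - d_i\binom{c_j}{2} + \alpha(c_{ij})d_j - c_jd_id_j\\
e_{ijk} &= c_{ijk} + d_{ijk} + \alpha(c_{ik})d_j - c_kd_id_j - d_ic_jc_k + \alpha(c_{ij})d_k - c_jd_id_k\\
e_{jki} &= c_{jki} + d_{jki} + \alpha(c_{jk})d_i + \alpha(c_{ik})d_j - c_kd_id_j
\end{aligned} \tag{29}$$

where

$$\alpha(c_{ij}) = c_{ij} + 2c_{ij}^{(2)} + 2c_{ij}^{(3)}.$$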
Here there appear to be a few problems as to ambiguities, since, for example,
$d_i$ is an integer modulo $2^{r_i}$ and appears in the computation of $e_{ij}$ which is an
integer modulo $2^{r_i+1}$. However, if $d_i$ is replaced by $d_i + 2^{r_i}$, then

$$-c_jd_i + 2c_j\binom{d_i}{2} + 2d_i\binom{c_j}{2}$$

remains unchanged modulo $2^{r_i+1}$. Similar reasoning applies to other cases of
apparent ambiguity.
We can now proceed as in the proof of Theorem 1 and construct a group
$H$ made of

$$\Big[t + 3\binom{t}{2} + 2\binom{t}{3}\Big]\text{-tuples}$$

with multiplication as indicated by (29). The verification of the group axioms
is straightforward, but tedious.

It might be asked whether a modification of (18) could be used
instead of (29). There are several difficulties: in the case of p = 2, the e,, 
are integers modulo 2’**', but c,, d; which appear in the formula for e,, are 
integers modulo 2” (assuming r; = r,). Similarly if r; = ry, e4;, is an integer 
modulo 2% and (#) will cause difficulties, since it is not unambiguously 
defined modulo 2”. If one decides to let c,, be integers modulo 2%, then the 
fact that 


ri ri-l 
(a4, a5)" = (a4, as, a4)" (see (27)) 


means that the multiplication formulas would have to take into account in 
some way the fact that the order of (a;,a,) is 2’**'. The author tried to 
think of a device to get around these difficulties, but was unable to do so. 

If one attempts to carry out computations for the general case, with
$p \le n - 1$, then by using (5) and (6) one readily obtains Lemma 3 below.
Since nilpotent groups of finite order are direct products of p-groups, we 
consider only the case of p-groups here. 


LEMMA 3. Let $A_1, \ldots, A_t$ be cyclic groups of order
$p^{\alpha_1}, \ldots, p^{\alpha_t}$
respectively. Let $a_i$ generate $A_i$. Let $F = \prod^{*}_{i=1,\ldots,t} A_i$. Let $G = F/F_n$; let
$v = (a_{i_1}, a_{i_2}, \ldots, a_{i_r}) \in G_r.$
Let
$\alpha = \min(\alpha_{i_1}, \ldots, \alpha_{i_r}).$
Then
(30) $v^{p^{\alpha}} \in G_{r+(p-1)},$
$v^{p^{\alpha+j}} \in G_{r+(j+1)(p-1)}, \quad j = 0, 1, 2, \ldots.$


If $w \in G_r$, then w can be substituted for v in (30).


Proof. The proof follows by induction ($r = n - 1, n - 2, \ldots$) and uses
(6) and (4).

Note that (20) is a special case of (30) with w = a‘b’,r = 1,p = 3,a = 1, 
j = 0, n = 4. Similarly, using group (23), one obtains another special case 













of Lemma 3, with w = ab, r= 1, p= 2, n =3, a= 1, j = 0. This gives 
rise to the conjecture that these may be the best possible results in the 
following sense: 


Conjecture. In the notation of Lemma 3, the order of v is $p^{\alpha+j}$, where j is
the least integer such that
$r + (j + 1)(p - 1) \ge n.$


However, the author was unable to think of a way to prove that the order
of v is exactly $p^{\alpha+j}$ and not something less, nor of a manageable method to
solve the general case of $p \le n - 1$.


REFERENCES 


1. O. N. Golovin, Nilpotent products of groups, Mat. Sbornik N.S., 27 (69) (1950), 427-454. Amer. Math. Soc. Translations, 2, 2 (1956), 89-115.
2. --- Metabelian products of groups, Mat. Sbornik N.S., 28 (70) (1951), 431-444. Amer. Math. Soc. Translations, 2, 2 (1956), 117-132.
3. --- On the isomorphism of nilpotent decompositions of groups, Mat. Sbornik N.S., 28 (70) (1951), 445-452. Amer. Math. Soc. Translations, 2, 2 (1956), 133-140.
4. K. W. Gruenberg, Residual properties of infinite soluble groups, Proc. London Math. Soc., Series 3, 7 (1957), 29-62.
5. Philip Hall, A contribution to the theory of groups of prime-power order, Proc. London Math. Soc., 36 (1934), 29-95.
6. --- Nilpotent groups, Lecture Notes of Summer Seminar, Canadian Mathematical Congress (University of Alberta, August, 1957).
7. Marshall Hall, A basis for free Lie rings and higher commutators in free groups, Proc. Amer. Math. Soc., 1 (1950), 575-581.
8. A. Karrass and D. Solitar, On free products of groups, Bull. Amer. Math. Soc., 63 (1957), 407.
9. Friedrich Levi and B. L. van der Waerden, Ueber eine besondere Klasse von Gruppen, Abhandlungen aus dem Hamburg Universität, 9 (1932), 154-158.
10. R. C. Lyndon, On Burnside's problem, I, Trans. Amer. Math. Soc., 77 (1954), 202-215.
11. W. Magnus, Ueber Beziehungen zwischen höheren Kommutatoren, J. Reine Angew. Math., 177 (1937), 105-115.
12. S. Moran, Associative operations on groups I, Proc. London Math. Soc., 6 (1956), 581-596.
13. I. N. Sanov, Establishment of a connection between periodic groups with prime power periods and Lie rings, Izvestiya Akad. Nauk SSSR Ser. Mat., 16 (1952), 23-58.
14. R. R. Struik, Notes on a paper by Sanov, Proc. Amer. Math. Soc., 8 (1957), 638-641.
15. --- A note on prime power groups, Can. Math. Bull., 3 (1960), 27-30.


University of British Columbia 































TRACES OF MATRICES OF ZEROS AND ONES 
H. J. RYSER 


1. Introduction. This paper continues the study appearing in (9) and
(10) of the combinatorial properties of a matrix A of m rows and n columns,
all of whose entries are 0's and 1's. Let the sum of row i of A be denoted
by $r_i$ and let the sum of column j of A be denoted by $s_j$. We call
$R = (r_1, \ldots, r_m)$ the row sum vector and $S = (s_1, \ldots, s_n)$ the column sum
vector of A. The vectors R and S determine a class

(1.1) $\mathfrak{A} = \mathfrak{A}(R, S)$

consisting of all (0, 1)-matrices of m rows and n columns, with row sum
vector R and column sum vector S. The majorization concept yields simple
necessary and sufficient conditions on R and S in order that the class $\mathfrak{A}$ be
non-empty (4; 9). Generalizations of this result and a critical survey of a
wide variety of related problems are available in (6).

Consider the 2 by 2 submatrices of A of the types

$A_1 = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}$ and $A_2 = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}.$

An interchange is a transformation of the elements of A which changes a
minor of type $A_1$ into type $A_2$, or vice versa, and leaves all other elements
of A unaltered. The interchange theorem (9) asserts that if A and A* belong
to $\mathfrak{A}$, then A is transformable into A* by interchanges.
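The interchange operation can be illustrated by a minimal Python sketch. The function names (`interchange`, `row_sums`, `col_sums`) and the 0-based indexing are assumptions made for this illustration; they do not appear in the paper.

```python
from typing import List

Matrix = List[List[int]]


def interchange(a: Matrix, i: int, j: int, k: int, l: int) -> Matrix:
    """Return a copy of `a` after an interchange on rows i, j and columns k, l.

    Indices are 0-based (an assumption of this sketch). The 2 by 2 minor
    must be of type A_1 or A_2; it is replaced by the other type.
    """
    minor = (a[i][k], a[i][l], a[j][k], a[j][l])
    if minor not in ((1, 0, 0, 1), (0, 1, 1, 0)):
        raise ValueError("the selected minor is neither of type A_1 nor of type A_2")
    b = [row[:] for row in a]
    for r in (i, j):
        for c in (k, l):
            b[r][c] = 1 - b[r][c]  # flipping all four entries swaps A_1 and A_2
    return b


def row_sums(a: Matrix) -> List[int]:
    return [sum(row) for row in a]


def col_sums(a: Matrix) -> List[int]:
    return [sum(col) for col in zip(*a)]
```

Because each of the two rows and each of the two columns loses one 1 and gains one 1, the result has the same row and column sum vectors and so stays in the same class.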

The term rank $\rho$ of A is the order of the greatest minor of A with a non-zero
term in its determinant expansion (8). This integer equals the minimal number
of rows and columns which contain collectively all of the non-zero elements
of A (7). Now let $\bar\rho$ be the maximal and $\tilde\rho$ the minimal term rank for the
matrices in $\mathfrak{A}$. The interchange theorem implies the existence of a matrix A
in $\mathfrak{A}$ of term rank $\rho$ (9). Here $\rho$ is an arbitrary integer in the interval

(1.2) $\tilde\rho \le \rho \le \bar\rho.$

Let $\delta_i = (1, \ldots, 1, 0, \ldots, 0)$ be a vector of n components, with 1's in the
first $r_i$ positions and 0's elsewhere. The matrix

(1.3) $\bar A = \begin{pmatrix} \delta_1 \\ \vdots \\ \delta_m \end{pmatrix}$
Received April 7, 1959. This work was sponsored in part by the Office of Ordnance Research. 


is called maximal, and $\bar A$ is the maximal form of A. Suppose that the
components of R and S are positive. Define
$R' = (r_1 - 1, \ldots, r_m - 1)$
and let $\bar A'$ be the maximal matrix of m rows and n columns with row sum
vector $R'$. Let the column sum vector of $\bar A'$ equal
$\bar S' = (\bar s'_1, \ldots, \bar s'_n).$
Renumber the subscripts of the column sum vector $S = (s_1, \ldots, s_n)$ so that
$s_1 \ge \ldots \ge s_n.$
Define
$s'_i = s_i - 1 \quad (i = 1, \ldots, n),$
$\bar s'_0 = s'_0 = 0,$
and let

(1.4) $M = \max_k \sum_{i=0}^{k} (s'_i - \bar s'_i) \quad (k = 0, \ldots, n).$

One may prove (10)

(1.5) $\bar\rho = m - M.$
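As a hedged illustration of (1.4) and (1.5), the sketch below computes m − M in Python. It assumes that (1.4) is the largest partial sum of the differences s'_i − s̄'_i (with the empty sum for k = 0), that R and S have positive components, and that the class is non-empty. The function name `max_term_rank` is hypothetical.

```python
from typing import Sequence


def max_term_rank(R: Sequence[int], S: Sequence[int]) -> int:
    """Maximal term rank via (1.5), under the reading of (1.4) stated above."""
    m, n = len(R), len(S)
    s = sorted(S, reverse=True)           # renumber so that s_1 >= ... >= s_n
    r_prime = [r - 1 for r in R]          # R' = (r_1 - 1, ..., r_m - 1)
    # Column j of the maximal matrix with row sums R' has a 1 in every row
    # whose sum is at least j.
    s_bar = [sum(1 for r in r_prime if r >= j) for j in range(1, n + 1)]
    s_prime = [x - 1 for x in s]          # s'_i = s_i - 1
    best = 0                              # k = 0 contributes the empty sum 0
    partial = 0
    for k in range(n):
        partial += s_prime[k] - s_bar[k]
        best = max(best, partial)
    return m - best
```

For R = (3, 1, 1), S = (3, 1, 1) the class contains a single matrix, whose 1's lie in its first row and first column, and the sketch returns 2.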

A simple formula for $\tilde\rho$ analogous to (1.5) for $\bar\rho$ does not appear to exist.
However, Haber in a forthcoming paper obtains an algorithm that yields an
effective procedure for the determination of $\tilde\rho$ (5).
Throughout the discussion we suppose that $\mathfrak{A}$ is non-empty and that
$R = (r_1, \ldots, r_m)$ and $S = (s_1, \ldots, s_n)$ satisfy

(1.6) $r_1 \ge \ldots \ge r_m > 0,$
(1.7) $s_1 \ge \ldots \ge s_n > 0.$

We call the above R and S and the associated $\mathfrak{A}$ of (0, 1)-matrices normalized.
Term rank is invariant under permutations of rows and columns. Thus normalization
does not restrict this concept. Indeed, formula (1.5) for $\bar\rho$ actually
requires a normalized S.

For $A = [a_{rs}]$ in $\mathfrak{A}$ we define the trace of A by

(1.8) $\operatorname{tr}(A) = \sum_{i=1}^{\epsilon} a_{ii},$

where

(1.9) $\epsilon = \min(m, n).$

Fulkerson has recently investigated feasibility conditions for the existence
of a (0, 1)-matrix of order m with specified row and column sums and 0 trace
(3). He utilizes the theory of network flows (1; 2; 4) and obtains an especially
simple criterion for the case in which $\mathfrak{A}$ is normalized (3). Let $\bar\sigma$ be the maximal










and $\tilde\sigma$ the minimal trace for the matrices in the normalized $\mathfrak{A}$. In the present
paper we develop a trace theory for $\bar\sigma$ and $\tilde\sigma$ analogous to the term rank theory
for $\bar\rho$ and $\tilde\rho$. The requirement of positive components on R and S is without
loss of generality. The ordering of the components in accordance with (1.6)
and (1.7) does impose a restriction. But this is necessary in order to obtain
conclusions of the uncomplicated type to be described.

Let A be in the normalized $\mathfrak{A}$ and write

(1.10) $A = \begin{pmatrix} W & X \\ Y & Z \end{pmatrix},$

where W is of size e by f ($0 \le e \le m$; $0 \le f \le n$). For an arbitrary (0, 1)-
matrix Q, let $N_0(Q)$ denote the number of 0's in Q and $N_1(Q)$ the number
of 1's in Q. Let

(1.11) $t_{ef} = N_0(W) + N_1(Z) \quad (e = 0, \ldots, m; f = 0, \ldots, n)$

and define

(1.12) $T = [t_{ef}] \quad (e = 0, \ldots, m; f = 0, \ldots, n).$

T is called the structure matrix of the class $\mathfrak{A}$. Its elementary properties are
developed in § 2. Section 3 yields explicit formulae for $\bar\sigma$ and $\tilde\sigma$ in terms of
the entries of T:

(1.13) $\bar\sigma = \min_{e,f} \{ t_{ef} + \max(e, f) \},$

(1.14) $\tilde\sigma = \max_{e,f} \{ \min(e, f) - t_{ef} \}.$
eS 





Matrices with an unusually simple block decomposition are shown to exist
for the case of maximal and minimal trace, and these matrices play an essential
role in the derivations of (1.13) and (1.14). Section 4 stresses similarities and
differences in the behaviour of trace and term rank. The paper concludes with
the determination of the domain of intermediate values for the traces $\sigma$ of
the matrices in $\mathfrak{A}$. This usually consists of all integers in the interval

(1.15) $\tilde\sigma \le \sigma \le \bar\sigma.$

But certain classes $\mathfrak{A}$ exclude $\tilde\sigma + 1$ and others exclude $\bar\sigma - 1$.


2. The structure matrix. Let A belong to the normalized class $\mathfrak{A} =
\mathfrak{A}(R, S)$ described in § 1 and write

(2.1) $A = \begin{pmatrix} W & X \\ Y & Z \end{pmatrix},$

where W is of size e by f ($0 \le e \le m$; $0 \le f \le n$). Let

(2.2) $T = [t_{ef}] \quad (e = 0, \ldots, m; f = 0, \ldots, n)$













denote the structure matrix of $\mathfrak{A}$. This means that

(2.3) $t_{ef} = N_0(W) + N_1(Z) \quad (e = 0, \ldots, m; f = 0, \ldots, n),$

where $N_0(W)$ denotes the number of 0's in W and $N_1(Z)$ the number of 1's
in Z. It follows at once from (2.3) that

(2.4) $t_{ef} = ef + (r_{e+1} + \ldots + r_m) - (s_1 + \ldots + s_f).$

Thus the structure matrix is independent of the particular choice of A in $\mathfrak{A}$.
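The step from (2.3) to (2.4) can be spelled out by counting 1's in the blocks of (2.1). The short derivation below is a sketch of that counting argument in the notation of the text; the intermediate identities are supplied here and are not displayed in the original.

```latex
\begin{align*}
N_0(W) &= ef - N_1(W), \\
N_1(Z) &= (r_{e+1} + \dots + r_m) - N_1(Y), \\
N_1(W) + N_1(Y) &= s_1 + \dots + s_f, \\
t_{ef} = N_0(W) + N_1(Z)
  &= ef + (r_{e+1} + \dots + r_m) - \bigl(N_1(W) + N_1(Y)\bigr) \\
  &= ef + (r_{e+1} + \dots + r_m) - (s_1 + \dots + s_f).
\end{align*}
```

The first line uses the size of W, the second the row sums of the last m − e rows, and the third the column sums of the first f columns.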
Note that if

(2.5) $\tau = N_1(A) = r_1 + \ldots + r_m,$

then the first row and column of T are given by

(2.6) $t_{0f} = \tau - (s_1 + \ldots + s_f),$
$t_{e0} = \tau - (r_1 + \ldots + r_e).$


The structure matrix has a number of interesting properties that give
insight into the combinatorial behaviour of $\mathfrak{A}$. Its entries are, of course, non-
negative integers and its size is m + 1 by n + 1. For notational convenience
we number the rows of a matrix of these dimensions from 0 through m and
its columns from 0 through n. Let $E_k$ be the triangular matrix of order k + 1,
with 1's on and below the main diagonal and 0's elsewhere. Let $E_k^T$ denote
the transpose of $E_k$, and number the rows and columns of $E_k$ and $E_k^T$ from
0 through k. Let S be the m by n matrix of 1's. Then

(2.7) $E_m \begin{pmatrix} \tau & -s_1 & \cdots & -s_n \\ -r_1 & & & \\ \vdots & & S & \\ -r_m & & & \end{pmatrix} E_n^T = T.$

For the eth row vector of the product of the first two matrices on the left
side of equation (2.7) is

$(r_{e+1} + \ldots + r_m,\; -s_1 + e,\; \ldots,\; -s_n + e).$

If this row is multiplied by the fth column of $E_n^T$, then we obtain

(2.8) $ef + (r_{e+1} + \ldots + r_m) - (s_1 + \ldots + s_f).$

But by (2.4) this is $t_{ef}$.
If in (2.4) we replace f by f + 1, then

(2.9) $t_{e,f+1} = ef + e + (r_{e+1} + \ldots + r_m) - (s_1 + \ldots + s_{f+1})$
$(e = 0, \ldots, m; f = 0, \ldots, n - 1).$



















By (2.4) and (2.9),

(2.10) $t_{e,f+1} = t_{ef} + e - s_{f+1} \quad (e = 0, \ldots, m; f = 0, \ldots, n - 1).$

Similarly, we may deduce

(2.11) $t_{e+1,f} = t_{ef} + f - r_{e+1} \quad (e = 0, \ldots, m - 1; f = 0, \ldots, n).$

The recursions (2.10) and (2.11) are useful in constructing T from a given
R and S.

From (2.10) we see that the eth row of T may be written in the form

(2.12) $(t_{e0},\; t_{e0} + e - s_1,\; t_{e1} + e - s_2,\; \ldots,\; t_{e,n-1} + e - s_n).$

S is normalized so that by (2.12), if $e \le s_n$, then

(2.13) $t_{e0} \ge t_{e1} \ge \ldots \ge t_{en},$

and if $e \ge s_1$, then

(2.14) $t_{e0} \le t_{e1} \le \ldots \le t_{en}.$

On the other hand, if $s_n < e < s_1$, then there must exist an integer f ($0 < f < n$)
such that

(2.15) $t_{e0} \ge t_{e1} \ge \ldots \ge t_{ef} \le t_{e,f+1} \le \ldots \le t_{en}.$

The columns of T have an analogous monotonic behaviour.
The following numerical example affords a simple illustration of the
preceding remarks:

R = (4, 3, 2, 2, 1), S = (4, 4, 2, 1, 1),

$T = \begin{pmatrix}
12 & 8 & 4 & 2 & 1 & 0 \\
8 & 5 & 2 & 1 & 1 & 1 \\
5 & 3 & 1 & 1 & 2 & 3 \\
3 & 2 & 1 & 2 & 4 & 6 \\
1 & 1 & 1 & 3 & 6 & 9 \\
0 & 1 & 2 & 5 & 9 & 13
\end{pmatrix}.$







3. Maximal and minimal traces. Let $\bar\sigma$ be the maximal and $\tilde\sigma$ the minimal
trace for the matrices in the normalized class $\mathfrak{A}$. In this section we develop
simple block decompositions for the matrices of maximal and minimal trace
and use these decompositions to derive (1.13) and (1.14). We begin with an
elementary property of the trace function for the class $\mathfrak{A}$.


THEOREM 3.1. Suppose the normalized $\mathfrak{A}$ contains a matrix of trace $\sigma$. Then
there exists an $A = [a_{rs}]$ in $\mathfrak{A}$ of trace $\sigma$ with the 1's in the initial positions on
the main diagonal













(3.1) $a_{11} = \ldots = a_{\sigma\sigma} = 1,$
$a_{\sigma+1,\sigma+1} = \ldots = a_{\epsilon\epsilon} = 0 \quad (\epsilon = \min(m, n)).$

For suppose that we have an A in $\mathfrak{A}$ of trace $\sigma$ with

(3.2) $a_{11} = \ldots = a_{e-1,e-1} = 1 \quad (e - 1 < \sigma),$
$a_{ee} = 0.$

It suffices to show that it is possible to transform A by interchanges into a
matrix of trace $\sigma$ with the e leading diagonal elements equal to 1. Now there
must exist an integer $t > e$ such that $a_{tt} = 1$. Suppose that

(3.3) $a_{et} = a_{te} = 0.$

Since $r_e \ge r_t$ and $s_e \ge s_t$, there exist integers u and v such that $a_{ev} = 1$,
$a_{tv} = 0$ and $a_{ue} = 1$, $a_{ut} = 0$. We apply an interchange involving positions
(e, t), (e, v), (t, v), (t, t), and follow this by an interchange involving positions
(e, e), (e, t), (u, t), (u, e). This gives a 0 in the (t, t) position and a 1 in the
(e, e) position. The other diagonal elements are not altered. The remaining
cases

(3.4) $a_{et} = a_{te} = 1,$
(3.5) $a_{et} = 1,\; a_{te} = 0,$
(3.6) $a_{et} = 0,\; a_{te} = 1,$

are disposed of by similar arguments.
We turn now to a study of the maximal trace $\bar\sigma$ for matrices in the class $\mathfrak{A}$.


THEOREM 3.2. Let $\bar\sigma \ne \min(m, n)$. Then there exists a matrix $A_{\bar\sigma}$ of trace $\bar\sigma$
in the normalized $\mathfrak{A}$ of the form

(3.7) $A_{\bar\sigma} = \begin{pmatrix} S & * & * \\ * & \bar 0 & 0 \\ * & 0 & 0 \end{pmatrix}.$

Here S is a matrix of 1's of specified size e by f ($0 \le e \le \bar\sigma$; $0 \le f \le \bar\sigma$). The
matrix $\bar 0$ is of size g by h and has 1's in the main diagonal positions of $A_{\bar\sigma}$ and
0's in all other positions. Moreover,

(3.8) $e + g = f + h = \bar\sigma.$

The 0's denote zero matrices.


Consider a matrix A in $\mathfrak{A}$ with the $\bar\sigma$ 1's in the initial positions on the main
diagonal. We have $\bar\sigma > 0$. The block in the lower right corner of size $m - \bar\sigma$
by $n - \bar\sigma$ must be a zero block. For the row and column sum vectors are
normalized and if the block contained a 1, then a suitable interchange would
increase $\bar\sigma$. Now the matrix A may be selected to be of the following form:













(3.9) $A = \begin{pmatrix} S_1 & * & R_1 \\ * & * & 0_1 \\ C_1 & 0_2 & 0 \end{pmatrix}.$

Here 0 is the zero block of size $m - \bar\sigma$ by $n - \bar\sigma$, $0_1$ and $0_2$ are zero blocks,
$R_1$ has at least one 1 in each row, and $C_1$ has at least one 1 in each column.
For let us consider two vectors $X_1$ and $X_2$ from among the first $\bar\sigma$ rows of A
and let the entries of these vectors total $x_1$ and $x_2$, respectively. Let $X_1$ have
0's in its last $n - \bar\sigma$ positions and let $X_2$ have a 1 in at least one of these
positions. Suppose $X_1$ is above $X_2$ in A. If $x_1 > x_2$, then we may apply an
interchange involving $X_1$ and $X_2$ that does not shift a 1 on the main diagonal
and places a 1 in one of the last $n - \bar\sigma$ positions of $X_1$. If $x_1 = x_2$, we may
still apply the interchange and place a 1 in one of the last $n - \bar\sigma$ positions
of $X_1$. However, in exceptional cases a single interchange may be available
for this purpose and this may force a reduction in trace to $\bar\sigma - 1$. But if this
is the case, then a second interchange confined to the first $\bar\sigma$ columns restores
trace $\bar\sigma$. This procedure yields the blocks $R_1$ and $0_1$ of (3.9). Next we work
on the columns in the same way. Again a single interchange may be
available and force a reduction in trace to $\bar\sigma - 1$. But under these circumstances
there is always available a second interchange confined to the first $\bar\sigma$
rows and columns that regains trace $\bar\sigma$. This is the case since otherwise an
interchange exists that restores the 1 to the main diagonal and places a 1 in
the block 0, contradicting the maximality of $\bar\sigma$. This gives us a matrix of the
form (3.9).

Now let S; of (3.9) be of size @ by f. S; must be a block of 1's, for other- 
wise we could increase ¢. An A of the form (3.9) with fixed @ and f we call 
reduced. The & 1's on its main diagonal we call essential 1’s. All other 1's are 
called unessential. Without loss of generality we may assume é < f. 

We now consider a reduced A* in & of the form 


(3.10) A*=| X | 2/0 


Here S; is a block of 1’s of size é by f* — f with f* — f maximal in A*. Among 
all reduced A in & we select that A* with its corresponding f* — f minimal. 
We must allow the case f* — f = 0. But if the minimal f* — f > 0, then S, 
is a block of 1’s that appears in all of the reduced A in W. If Y is not present, 
then (3.7) holds with 4 = 0. Suppose then that Y is present. Then our A* 
has a O in the first column of Y. If the block Z contains no unessential 1, 
then (3.7) holds with f = f*. Suppose, therefore, that an unessential 1 appears 
in the (s, ¢) position of Z, where s is maximal in Z. If ¢ = 1, then there is a 
0 in column ¢ of Y. Suppose that ¢ # 1 and let the (s, 1) position of Z contain 










a 0 or an essential 1. If column # of Y contains only 1's, then we may per- 
form an interchange using unessential 1’s and the 0 in column 1 of Y to obtain 
a 0 in column ¢ of Y. We henceforth require A* to have a 0 in column ¢ (but 
no longer column 1) of Y. 

The preceding remarks imply that the entries in row s of X must be I's. 
For if this is not the case then a single interchange gives a reduced A* with 
a 0 in Ss, contradicting f* — f minimal, or else two interchanges place a | 
in 0, contradicting ¢ maximal. Suppose that X has a 0 present in its (x, 2) 
position, where u < s. Then we may apply an interchange involving this 0 
and the 1 in the (s,v) position of X. If this interchange does not involve an 
essential 1, then a second interchange involving the unessential 1 in the 
(s, 4) position of Z introduces a 0 into S,; or S:. This leads us to the same 
contradiction as before. Suppose then that the interchange involving the 0 
in the (u,v) position and the 1 in the (s,v) position of X does involve an 
essential 1. Consider the case s < f* — @. Then a second interchange involving 
the unessential 1 in the (s, ¢) position of Z regains trace ¢ and introduces a 0 
into S, or Ss. This is again a contradiction. If s > f* — @, then the second 
interchange involving the unessential | in the (s, ¢) position of Z introduces 
a 0 into S, or S:. However, the trace of the matrix upon completion of this 
interchange remains at ¢ — 1. But then we may apply a third interchange 
involving rows u and s of Z and regain trace ¢. Thus in all cases there is no 
0 present in the (u,v) position of X, where u < s. This gives a matrix of 
the form (3.7) and completes the proof. 


THEOREM 3.3. The maximal trace $\bar\sigma$ for the matrices in the normalized $\mathfrak{A}$ is
given by

(3.11) $\bar\sigma = \min_{e,f} \{ t_{ef} + \max(e, f) \}.$

Let A be a matrix in $\mathfrak{A}$ of trace $\bar\sigma$ with the $\bar\sigma$ 1's in the initial positions on
the main diagonal. Let A be subdivided into the four blocks W, X, Y, Z of
(2.1) with W of size e by f. Now for the matrix A under consideration it is
clear that

(3.12) $N_1(Z) \ge \bar\sigma - \max(e, f).$

But $N_0(W) \ge 0$ so that

(3.13) $t_{ef} + \max(e, f) = N_0(W) + N_1(Z) + \max(e, f) \ge \bar\sigma$
$(e = 0, \ldots, m; f = 0, \ldots, n).$

Suppose that $\bar\sigma \ne \min(m, n)$. Then we may specialize our A to the $A_{\bar\sigma}$ of
Theorem 3.2. The submatrix S of $A_{\bar\sigma}$ is of size e by f. We may set W = S
and obtain $N_0(W) = 0$ and $N_1(Z) = \bar\sigma - \max(e, f)$. Thus if $\bar\sigma \ne \min(m, n)$,



















equality is attained in (3.13) for the dimension numbers e and f of the
submatrix S of $A_{\bar\sigma}$. If $\bar\sigma = m$, equality is attained in (3.13) for f = 0 and e = m.
If $\bar\sigma = n$, equality is attained in (3.13) for e = 0 and f = n. This proves
Theorem 3.3.
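Formula (3.11) is easy to evaluate once the entries of T are available from (2.4). The following Python sketch does so by a direct search over all pairs (e, f); the names `structure_entry` and `max_trace` are hypothetical and the input is assumed to describe a non-empty normalized class.

```python
from typing import Sequence


def structure_entry(R: Sequence[int], S: Sequence[int], e: int, f: int) -> int:
    """Closed form (2.4) for the entry t_ef of the structure matrix."""
    return e * f + sum(R[e:]) - sum(S[:f])


def max_trace(R: Sequence[int], S: Sequence[int]) -> int:
    """Maximal trace by (3.11): the minimum of t_ef + max(e, f) over all e, f."""
    m, n = len(R), len(S)
    return min(
        structure_entry(R, S, e, f) + max(e, f)
        for e in range(m + 1)
        for f in range(n + 1)
    )
```

For the numerical example R = (4, 3, 2, 2, 1), S = (4, 4, 2, 1, 1) of § 2, the minimum is attained at e = f = 2, where the entry of T is 1, so the sketch returns 3.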

We consider next the minimal trace $\tilde\sigma$ for the matrices in $\mathfrak{A}$.


THEOREM 3.4. Let the matrices in the normalized $\mathfrak{A}$ have precisely u rows
and v columns composed entirely of 1's and let $\tilde\sigma \ne \max(u, v)$. Then there exists
a matrix $A_{\tilde\sigma}$ of trace $\tilde\sigma$ in $\mathfrak{A}$ of the form

(3.14) $A_{\tilde\sigma} = \begin{pmatrix} S & S_1 & * \\ S_2 & \bar S & * \\ * & * & 0 \end{pmatrix}.$

Here S is a matrix of 1's of order $\tilde\sigma$. $S_1$ of size $\tilde\sigma$ by s and $S_2$ of size t by $\tilde\sigma$ are
matrices of 1's. $\bar S$ is a matrix with 0's in the main diagonal positions of $A_{\tilde\sigma}$ and
1's in all other positions. 0 is a zero matrix. (The cases s = 0 and t = 0 are not
excluded.)

Let A be a matrix in $\mathfrak{A}$. If A is not square, then add zero rows at the bottom
or zero columns at the right and obtain a square matrix A of order max(m, n).
In A replace the 1's by 0's and the 0's by 1's. This yields a matrix C called
the complement of A. The matrix C determines a class $\mathfrak{C}$. Let C be a matrix
in $\mathfrak{C}$ of maximal trace $\bar\sigma_c$. Evidently

(3.15) $\tilde\sigma = \max(m, n) - \bar\sigma_c.$

The matrix C has row sums and column sums in ascending order. Moreover,
$\bar\sigma_c \ne \max(m, n) - \max(u, v)$, for otherwise $\tilde\sigma = \max(u, v)$. We now apply
Theorem 3.2 to the block in the lower right corner of C of size max(m, n) − u
by max(m, n) − v. This tells us that C may be written in the form
0 0 ° 7 
(3.16) C 0 Ab ae 
3.16) = 0 0 
* = S 
% all 








The 0’s denote zero blocks. The 0 in the upper left corner of C is of size 


u by v. S is a block of 1's of size 2 by f. 0 of size g by h has 1’s in the main 
diagonal positions of C and 0's in all other positions. Moreover, 


(3.17) é+g=fth=a.. 


Now take the complement of C and delete all zero rows or columns. This
yields a matrix $A_{\tilde\sigma}$ of the type described in the theorem.













THEOREM 3.5. The minimal trace $\tilde\sigma$ for the matrices in the normalized $\mathfrak{A}$ is
given by

(3.18) $\tilde\sigma = \max_{e,f} \{ \min(e, f) - t_{ef} \} \quad (e = 0, \ldots, m; f = 0, \ldots, n).$

Let A be a matrix in $\mathfrak{A}$ of trace $\tilde\sigma$ with the $\tilde\sigma$ 1's in the initial positions on
the main diagonal. Let A be subdivided into the four blocks W, X, Y, Z of
(2.1) with W of size e by f. Now for the matrix A under consideration, it is
clear that

(3.19) $N_0(W) \ge \min(e, f) - \tilde\sigma \quad (e = 0, \ldots, m; f = 0, \ldots, n).$

But $N_1(Z) \ge 0$ so that

(3.20) $\min(e, f) - t_{ef} = \min(e, f) - (N_0(W) + N_1(Z)) \le \tilde\sigma$
$(e = 0, \ldots, m; f = 0, \ldots, n).$

Suppose that $\tilde\sigma \ne \max(u, v)$, where the matrices in $\mathfrak{A}$ have precisely u
rows and v columns composed entirely of 1's. Then we may specialize our A
to the $A_{\tilde\sigma}$ of Theorem 3.4. The matrix $A_{\tilde\sigma}$ yields a W, X, Y, Z block
subdivision with W of size e by f for which $N_0(W) = \min(e, f) - \tilde\sigma$ and $N_1(Z) = 0$.
Thus if $\tilde\sigma \ne \max(u, v)$, then there exists an e and an f for which equality is
attained in (3.20). If $\tilde\sigma = u$, equality is attained in (3.20) for $e = \tilde\sigma$ and
f = n. If $\tilde\sigma = v$, equality is attained in (3.20) for e = m and $f = \tilde\sigma$.
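The companion formula (3.18) for the minimal trace can be evaluated in the same way as (3.11). The Python sketch below is an illustration under the assumption of a non-empty normalized class; `structure_entry` and `min_trace` are names chosen for this sketch only.

```python
from typing import Sequence


def structure_entry(R: Sequence[int], S: Sequence[int], e: int, f: int) -> int:
    """Closed form (2.4) for the entry t_ef of the structure matrix."""
    return e * f + sum(R[e:]) - sum(S[:f])


def min_trace(R: Sequence[int], S: Sequence[int]) -> int:
    """Minimal trace by (3.18): the maximum of min(e, f) - t_ef over all e, f."""
    m, n = len(R), len(S)
    return max(
        min(e, f) - structure_entry(R, S, e, f)
        for e in range(m + 1)
        for f in range(n + 1)
    )
```

For R = (4, 3, 2, 2, 1), S = (4, 4, 2, 1, 1) the maximum is 1, attained for instance at e = f = 2: the 2 by 2 block W can contain at most one 0, so at least one of its two diagonal entries is a 1 in every matrix of the class.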


4. Trace and term rank. Let A belong to the normalized class $\mathfrak{A}$, and
let $\bar\rho$ be the maximal term rank for the matrices in $\mathfrak{A}$. The integer $\bar\rho$ is given
explicitly by (1.5). We derive a second formula for $\bar\rho$ analogous to (3.11)
for $\bar\sigma$.


THEOREM 4.1. The maximal term rank $\bar\rho$ for the matrices in the normalized
$\mathfrak{A}$ is given by

(4.1) $\bar\rho = \min_{e,f} \{ t_{ef} + (e + f) \} \quad (e = 0, \ldots, m; f = 0, \ldots, n).$

Let A be in the normalized $\mathfrak{A}$ and of maximal term rank $\bar\rho$. Let A be
subdivided into the four blocks W, X, Y, Z of (2.1) with W of size e by f. Now
the term rank of a matrix equals the minimal number of rows and columns
which contain collectively all of the non-zero elements of the matrix. Hence
for the matrix A under consideration, it is clear that

(4.2) $N_1(Z) + (e + f) \ge \bar\rho.$

But $N_0(W) \ge 0$ so that

(4.3) $t_{ef} + (e + f) = N_0(W) + N_1(Z) + (e + f) \ge \bar\rho.$










Suppose that $\bar\rho < \min(m, n)$. Then by Theorem 3.2 of (10) we may specialize
our A to a matrix $A_{\bar\rho}$ of term rank $\bar\rho$ with a W of size e by f for which $N_0(W) = 0$
and $N_1(Z) = \bar\rho - (e + f)$. Thus if $\bar\rho \ne \min(m, n)$, equality is attained in
(4.3) for the dimension numbers e and f of the submatrix W of $A_{\bar\rho}$. If $\bar\rho = m$,
equality is attained in (4.3) for f = 0 and e = m. If $\bar\rho = n$, equality is attained
in (4.3) for e = 0 and f = n. This establishes (4.1).
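As with the trace formulas, (4.1) reduces the computation of the maximal term rank to a search over the entries of T. The sketch below is a hedged illustration; the function names are hypothetical, and the entries of T are taken from the closed form (2.4).

```python
from typing import Sequence


def structure_entry(R: Sequence[int], S: Sequence[int], e: int, f: int) -> int:
    """Closed form (2.4) for the entry t_ef of the structure matrix."""
    return e * f + sum(R[e:]) - sum(S[:f])


def max_term_rank_from_structure(R: Sequence[int], S: Sequence[int]) -> int:
    """Maximal term rank by (4.1): the minimum of t_ef + e + f over all e, f."""
    m, n = len(R), len(S)
    return min(
        structure_entry(R, S, e, f) + e + f
        for e in range(m + 1)
        for f in range(n + 1)
    )
```

Comparing (4.1) with (3.11), the only change is that max(e, f) is replaced by e + f, which is why the maximal term rank is never smaller than the maximal trace.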

Let A belong to the normalized class $\mathfrak{A}$. An element $a_{rs} = 1$ of A is an
invariant 1 provided that no sequence of interchanges applied to A replaces
$a_{rs} = 1$ by 0 (10). If $a_{rs} = 1$ is an invariant 1 of A, then the entries in the
(r, s) position of all of the matrices in $\mathfrak{A}$ must be invariant 1's. Thus all or
none of the matrices in $\mathfrak{A}$ contains an invariant 1, and we say $\mathfrak{A}$ is with or
without an invariant 1. The normalized class $\mathfrak{A}$ is with an invariant 1 if
and only if the matrices in $\mathfrak{A}$ are of the form

(4.4) $A = \begin{pmatrix} S & * \\ * & 0 \end{pmatrix}.$

Here S is a matrix of 1's of size e by f ($0 < e \le m$; $0 < f \le n$) and 0 is a
zero block (10). Now the entries of the structure matrix T of $\mathfrak{A}$ are non-
negative integers. Moreover,

(4.5) $t_{ef} > 0 \quad (e = 1, \ldots, m; f = 1, \ldots, n)$

if and only if $\mathfrak{A}$ is without an invariant 1. Indeed, each

(4.6) $t_{ef} = 0 \quad (e, f > 0)$

yields the dimension numbers e and f for a block decomposition of the type
displayed in (4.4).
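The criterion (4.5) and (4.6) can be checked mechanically. The Python sketch below lists the positions (e, f) with e, f > 0 at which the structure matrix vanishes; the class is with an invariant 1 exactly when the list is non-empty. The function names are assumptions of this sketch.

```python
from typing import List, Sequence, Tuple


def structure_entry(R: Sequence[int], S: Sequence[int], e: int, f: int) -> int:
    """Closed form (2.4) for the entry t_ef of the structure matrix."""
    return e * f + sum(R[e:]) - sum(S[:f])


def zero_positions(R: Sequence[int], S: Sequence[int]) -> List[Tuple[int, int]]:
    """All (e, f) with e, f > 0 and t_ef = 0, as in (4.6)."""
    m, n = len(R), len(S)
    return [
        (e, f)
        for e in range(1, m + 1)
        for f in range(1, n + 1)
        if structure_entry(R, S, e, f) == 0
    ]


def has_invariant_one(R: Sequence[int], S: Sequence[int]) -> bool:
    """True when the normalized class is with an invariant 1."""
    return bool(zero_positions(R, S))
```

For R = (3, 1, 1), S = (3, 1, 1) the entry at e = f = 1 is zero, which corresponds to the invariant 1 in the upper left corner. The class of the numerical example in § 2 has no such zero and is therefore without an invariant 1.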


Let $\bar\rho$ be the maximal and $\tilde\rho$ the minimal term rank for the matrices in $\mathfrak{A}$.
If $\tilde\rho < \min(m, n)$ and if $\mathfrak{A}$ is without an invariant 1, then $\tilde\rho < \bar\rho$ (10). But
important classes do exist with $\tilde\rho = \bar\rho$, for example, the class of all (0, 1)-
matrices of order m = n with exactly k 1's in each row and column. An
unsettled problem asks for a neat classification of all $\mathfrak{A}$ with $\tilde\rho = \bar\rho$. The
corresponding problem for traces in a normalized $\mathfrak{A}$ has an easy solution. For let
A in $\mathfrak{A}$ be of trace $\tilde\sigma$ with 1's in the initial positions on the main diagonal. Then
if $\tilde\sigma = \bar\sigma$, it follows readily that

(4.7) $A = \begin{pmatrix} S & * \\ * & 0 \end{pmatrix}.$

Here S is the matrix of 1's of order $\tilde\sigma$ and 0 is a zero block. Thus a normalized
class $\mathfrak{A}$ has $\tilde\sigma = \bar\sigma$ if and only if its structure matrix contains a zero on the
main diagonal.

A single interchange alters the term rank of a matrix by at most 1. It
follows from this and the interchange theorem that there exists an A in $\mathfrak{A}$
of term rank $\rho$, where $\rho$ is an arbitrary integer such that

(4.8) $\tilde\rho \le \rho \le \bar\rho.$












However, a single interchange may alter the trace $\sigma$ of a matrix in $\mathfrak{A}$ by 2.
This causes a complication in finding the domain of intermediate values for $\sigma$

(4.9) $\tilde\sigma \le \sigma \le \bar\sigma.$

The problem of intermediate values is settled by the following theorem.


THEOREM 4.2. The traces of the matrices in the normalized $\mathfrak{A}$ take on all
integral values in the interval $\tilde\sigma \le \sigma \le \bar\sigma$ unless $\mathfrak{A}$ contains a matrix of the form

(4.10) $A = \begin{pmatrix} S & S^* & * \\ S^{*T} & I_g & 0 \\ * & 0 & 0 \end{pmatrix}.$

Here S is a matrix of 1's of order e, $S^*$ is a rectangular matrix of 1's, $S^{*T}$ is the
transpose of $S^*$, $I_g$ is the identity matrix or the complement of this matrix, and
the 0's are zero matrices. The order of $I_g$ is g with $g \ge 2$. (The cases e = 0,
e + g = m, and e + g = n are not excluded.)


Two matrices in $\mathfrak{A}$ are transformable into each other by interchanges, and
a single interchange applied to a matrix in $\mathfrak{A}$ may alter its trace by at most
2. Consecutive traces of matrices in $\mathfrak{A}$ may differ by at most 2. Suppose then
that $\sigma$ and $\sigma - 2$ but not $\sigma - 1$ appear as the traces of matrices in $\mathfrak{A}$. Then
there exists an $A_\sigma$ in $\mathfrak{A}$ of trace $\sigma$ with a principal minor of order 2 that is
the identity.

Thus there exists an $A_\sigma$ in $\mathfrak{A}$ with a principal minor of order g

(4.11) $M = [m_{rs}] \quad (r, s = 1, \ldots, g)$

composed of consecutive rows and columns of $A_\sigma$ and such that

(4.12) $m_{11} = m_{gg} = 1, \quad m_{1g} = m_{g1} = 0.$

We let g be maximal among all matrices in $\mathfrak{A}$ of trace $\sigma$ and write

(4.13) $A_\sigma = \begin{pmatrix} A & B & D \\ C & M & F \\ E & G & H \end{pmatrix}.$

The first row of $A_\sigma$ passing through M must have the same sum as the last
row of $A_\sigma$ passing through M, for otherwise an interchange yields a trace of
$\sigma - 1$. But $\mathfrak{A}$ is normalized so all rows of $A_\sigma$ passing through M have the
same sum. Similar remarks hold for the columns of $A_\sigma$ passing through M.
Throughout the discussion we designate the submatrices of $A_\sigma$ in (4.13) by
$A = [a_{rs}]$, $F = [f_{rs}]$, etc. Suppose that in F some $f_{rs} = 1$. If $f_{1s} = 0$, we may
apply an interchange involving $f_{rs} = 1$ and $f_{1s} = 0$. This interchange cannot
yield a trace of $\sigma - 1$. Nor can the interchange increase the trace to $\sigma + 1$,
for then an interchange involving $m_{11} = m_{gg} = 1$ yields a trace of $\sigma - 1$.










Hence if some f,, = 1, then there exists an A, of the form (4.13) with Fie =1. 
We may now apply an interchange involving f,;, = 1 and m,, = 0. If the trace 
remains equal to ¢, a second interchange involving m,, = 1 and m,, = 0 
yields a trace of ¢ — 1. Suppose then that the interchange involving /,, = 1 
and m,, = 0 yields a trace of ¢ + 1. Let the 1 introduced on the main diagonal 
of A, be in the (¢, ¢) position of A. Then Zu = 1, for otherwise an interchange 
yields a trace of ¢ — 1. But now by an interchange involving m,, = 9 = 1, 
we regain trace ¢ and contradict the maximality of g. Hence F = 0. A similar 
argument gives G = 0. By the maximality of g, each h,,, = 0. If some h,, = 1 
with uw #9, then an interchange involving hu, = my, = 1 yields a trace of 
o — 1. Hence A = 0. 

Suppose that some é,, = 0. The rows of A, passing through M have the 
same sum, so that if some é,, = 0, then there exists an A, of the form (4.13) 
with ¢,, = 0. But then 4,, = 1, 6,, = 0 and bv» = €,, = 0, for otherwise an 
interchange yields a trace of ¢ — 1. But this contradicts the maximality 


of g. Hence C is a matrix of 1’s. Similarly, B is a matrix of 1's. 

Suppose that some 4,,, = 0. Then an interchange involving d,, = m,, = 0 
yields a trace of ¢ + 1. Now apply an interchange involving m,, = é,, = 1. 
This regains trace ¢ and contradicts the maximality of g. Hence each 4,,, = 1. 
If some d,, = 0 with u #v, then an interchange involving d,, = m,, = 0 
retains trace ¢. A second interchange yields a trace of ¢ — 1. Hence A is a 
matrix of 1's. 

All row and column sums of M = [m,,| must be equal. Suppose there 
exist u and v such that 


(4.14) Mu, = i, M., = 0 (l <u, v < g). 


If m,, = 0 or if m,, = 1, an interchange yields a trace of ¢ — 1. Hence M 
has trace g or trace 2. Suppose M has trace g. If m,, = 1 with ¢> 1, then 
m,, = 1. An interchange involving columns ¢ and g followed by an inter- 
change involving columns 1 and ¢ yields a trace of ¢ — 1. Hence row | of M 
has sum 1 and M = J. Suppose M has trace 2. If m,, = 0 with ¢ < g, then 
an interchange involving columns 1 and ¢ yields a trace of « — 1. Hence 
row | of M has sum g — 1 and an interchange replaces M by the complement 
of J. This proves Theorem 4.2. 

It is clear that Theorem 4.2 disposes of the problem of finding the domain
of intermediate values for the traces $\sigma$ of the matrices in $\mathfrak{A}$. For suppose
that $\mathfrak{A}$ contains a matrix of the form (4.10) and that $I_g$ is the identity matrix.
Then $\bar\sigma - 1$ is the single value excluded from the integers in the interval
$\tilde\sigma \le \sigma \le \bar\sigma$. Suppose on the other hand that $\mathfrak{A}$ contains a matrix of the form
(4.10) and that $I_g$ is the complement of the identity. Then $\tilde\sigma + 1$ is the single
value excluded from the integers in the interval $\tilde\sigma \le \sigma \le \bar\sigma$.
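For very small classes the domain of trace values can be found by exhaustive enumeration, which gives an independent check of the statements above. The brute-force Python sketch below is only an illustration and is not the method of the paper; the name `trace_set` is hypothetical, and the search is feasible only for small m and n.

```python
from itertools import combinations, product
from typing import Sequence, Set


def trace_set(R: Sequence[int], S: Sequence[int]) -> Set[int]:
    """Set of traces of all (0, 1)-matrices with row sums R and column sums S."""
    n = len(S)
    # Each row is represented by the set of columns that hold its 1's.
    row_choices = [list(combinations(range(n), r)) for r in R]
    traces = set()
    for rows in product(*row_choices):
        col = [0] * n
        for ones in rows:
            for j in ones:
                col[j] += 1
        if col == list(S):
            # Row i contributes to the trace when column i is among its 1's.
            traces.add(sum(1 for i, ones in enumerate(rows) if i in ones))
    return traces
```

For the permutation matrices of order 3 (R = S = (1, 1, 1)) the traces are 0, 1, 3, so the value one below the maximum is missing. For their complements (R = S = (2, 2, 2)) the traces are 0, 2, 3, so the value one above the minimum is missing. The class of the numerical example in § 2 attains every value from 1 to 3.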













REFERENCES 


1. L. R. Ford, Jr., and D. R. Fulkerson, A simple algorithm for finding maximal network flows and an application to the Hitchcock problem, Can. J. Math., 9 (1957), 210-218.
2. D. R. Fulkerson, A network-flow feasibility theorem and combinatorial applications, Can. J. Math., 11 (1959), 440-451.
3. --- Zero-one matrices with zero trace, Rand Corporation publication P-1618.
4. David Gale, A theorem on flows in networks, Pac. J. Math., 7 (1957), 1073-1082.
5. R. M. Haber, Term rank of 0, 1 matrices, to appear in Ill. J. Math.
6. Alan J. Hoffman, Some recent applications of the theory of linear inequalities to extremal combinatorial analysis, to appear in the American Mathematical Society publication of the symposium on combinatorial designs and analysis.
7. Dénes König, Theorie der endlichen und unendlichen Graphen (New York, 1950).
8. Oystein Ore, Graphs and matching theorems, Duke Math. J., 22 (1955), 625-639.
9. H. J. Ryser, Combinatorial properties of matrices of zeros and ones, Can. J. Math., 9 (1957), 371-377.
10. --- The term rank of a matrix, Can. J. Math., 10 (1958), 57-65.







The Ohio State University 













A NEW TYPE OF CHARACTERISTIC SUBGROUP OF 
PRIME-POWER GROUPS 


H. R. BRAHANA 


1. Introduction. In a recent paper (1) the fifty-eight metabelian groups 
of order $p^{11}$ that are generated by five elements and have all their elements of 
order p were determined and characterized in terms independent of any 
particular selection of the generating elements. In dealing with fifty-seven of 
these groups there was no occasion to distinguish between one odd prime and 
another, except that in exhibiting canonical forms it was necessary to select 
irreducible polynomials and these, of course, depended on p. The fifty-eighth 
group was described in two ways in terms that were independent of p, but the 
proof of uniqueness could not be made without taking into account properties 
of p. These properties distribute the primes into classes, and the properties 
are reflected in the groups of order $p^{11}$ in characteristic subgroups some of 
which exist for one prime and not for another. It may be that examination of 
the groups of isomorphisms of some of the fifty-seven groups would produce 
characteristic subgroups for one p that would not exist for another, but the 
writer considers it doubtful. The doubt is made plausible by the fact that 
examination of some of the likeliest groups yielded no such subgroups, and by 
the belief that if a group is described and a canonical form obtained without 
making use of any special property of the prime then anything that is true for 
a group with one p will have an analogue for one with another. The fifty-eighth 
group has some characteristic subgroups pointed to by geometric 
differences appearing with different types of primes. It is believed this phenomenon 
of prime-power groups has not been brought to light before. 


2. The groups. The following properties determine a group $\bar G$ of order 
$p^{15}$ for every value of p: 

1. Elements are all, except identity, of order p; 
2. The group is metabelian; 
3. Central and commutator subgroup coincide;¹ 
4. The group has five generators. 
The groups of order $p^{11}$ are obtained by adding four conditions on commutators 
of this group, on commutators only because a condition that contained one 
of the generators explicitly would give a group that would not satisfy 3. Any 
four independent conditions on commutators will give a group of order $p^{11}$, in 
certain well-defined cases again violating 3. 

Received June 15, 1959. 

¹Of course 3 includes 2. 













To get the group we are seeking we require the four conditions to be such 
that G of order $p^{11}$ satisfy: 

5. G contains no abelian subgroup of order $p^8$; 

6. G contains no subgroup of order $p^{10}$ whose commutator subgroup is of 
order $p^4$. 
G satisfying these conditions exists for every p and it is unique. In establishing 
the uniqueness it is necessary to make use of different arguments for different 
p's and this points to different characteristic subgroups of G. 
Groups of order $p^{11}$ satisfying 1, ..., 5 also satisfy 6 or 
6'. G contains one subgroup of order $p^{10}$ whose commutator subgroup is of 
order $p^4$. 
For purposes of comparison we shall consider this group too. Looked at 
geometrically the situation is not so mysterious. 

Let $U_1, U_2, U_3, U_4, U_5$ be generators of the group $\bar G$ of order $p^{15}$; let $c_{ij}$ be 
the commutator of $U_i$ and $U_j$; and let C be the group generated by the $c_{ij}$. 
Every element of $\bar G$ is 

$c\,U_1^{x_1} U_2^{x_2} U_3^{x_3} U_4^{x_4} U_5^{x_5}$ 

where c is in C and $x_i$ is an element in GF(p). The set $(x_1, x_2, x_3, x_4, x_5)$ may 
be taken to be a point in a projective four-space X over GF(p). Then every 
element of $\bar G$ not in C determines a point in X; every point in X represents a 
cyclic subgroup of $\bar G/C$ and also an abelian subgroup of order $p^{11}$ of $\bar G$. The 
commutator of two elements 

$c\,U_1^{x_1} \cdots U_5^{x_5}$ and $c'\,U_1^{y_1} \cdots U_5^{y_5}$ 

does not depend on c or c'; it is a product of the $c_{ij}$'s; it can be represented by 
a point in a projective nine-space S, over GF(p), which has for co-ordinates 
the Plücker line-co-ordinates of the line on the two points x and y in X. The 
points of S which represent commutators belong to V, which is a $V_6^5$ 
corresponding to the grassmannian of lines in X. Every point P of S not on V 
determines a three-space R in X, and the lines of R determine a five-space Σ 
in S; a Σ intersects V in a $V_4^2$; there is only one Σ in S, determined by an R 
in X, which contains P. A line in S which lies in a Σ is called a Σ-line; its 
points not on V all determine the same R in X. A line in S not a Σ-line and 
not intersecting V determines a unique point M on V such that the plane on M 
and the line is tangent to V at M. The space tangent to V at M, that is, the 
space consisting of all points P in S such that PM is a ruling of V or such that 
the five-space Σ which contains P contains M and PM meets V only at M, 
is six-dimensional; any plane in such a tangent space is called a τ-plane. 
The four conditions on commutators which reduce $\bar G$ of order $p^{15}$ to G of 
order $p^{11}$ set four independent elements of C equal to identity, and hence set 
equal to identity all the elements in the group generated by the four. These 
four elements of C are represented by four independent points of S, and the 
points determine a three-space $S_3$ in S. The group G is determined by the 













relation of $S_3$ to V. When 5 is satisfied, $S_3$ has no point on V; when 5 and 6 
are satisfied, $S_3$ contains no Σ-line; when 5 and 6' are satisfied, $S_3$ contains one 
Σ-line. These groups exist, and they are the only ones when 5 is satisfied. 
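
The passage from two points x, y of X to a point of the nine-space S can be made concrete with a small computation. The following is only an illustrative sketch: the function names `plucker` and `on_V`, and the sign convention $p_{ij} = x_i y_j - x_j y_i$, are assumptions introduced here and are not taken from the paper. It computes the ten Plücker line-co-ordinates over GF(p) and tests the five quadratic Plücker relations that cut out V.

```python
from itertools import combinations

# The ten index pairs (i, j) with i < j; they label the co-ordinates of S.
PAIRS = list(combinations(range(5), 2))

def plucker(x, y, p):
    """Plücker line-co-ordinates p_ij = x_i*y_j - x_j*y_i (mod p) of the line on x and y."""
    return tuple((x[i] * y[j] - x[j] * y[i]) % p for i, j in PAIRS)

def on_V(P, p):
    """True if the point P of S satisfies the five quadratic Plücker relations."""
    c = dict(zip(PAIRS, P))
    return all(
        (c[i, j] * c[k, l] - c[i, k] * c[j, l] + c[i, l] * c[j, k]) % p == 0
        for i, j, k, l in combinations(range(5), 4)
    )

if __name__ == "__main__":
    p = 7
    x, y = (1, 2, 3, 4, 5), (0, 1, 1, 2, 6)
    P = plucker(x, y, p)
    print(P, on_V(P, p))
```

A point such as the one with $p_{01} = p_{23} = 1$ and all other co-ordinates zero fails the relations, so it is a point of S not on V; under the correspondence described above it is such points that determine three-spaces R in X.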

The condition that $S_3$ intersect V leads to a fifth-degree congruence, $f(x) \equiv 0$, 
mod p. If f(x) has no linear factor in GF(p), condition 5 is satisfied; conditions 
6 and 6' correspond respectively to f(x) irreducible and f(x) the product of an 
irreducible quadratic and an irreducible cubic. In the latter case the quadratic 
is connected with the Σ-line and the cubic with a τ-plane, both unique. 


3. The characteristic subgroups. A point P in S not on V determines 
a three-space R in X, and R determines in $\bar G$ a subgroup $\bar H$ of index p. This 
subgroup is the direct product of a metabelian group of order $p^{10}$ generated 
by four elements and an abelian group of order $p^4$. When $\bar G$ is reduced to G 
by setting equal to identity the elements of C which correspond to points of 
$S_3$, $\bar H$ becomes a group H of order $p^{10}$ and its commutator subgroup remains 
of order $p^6$ if P is not in the five-space Σ determined by a point of $S_3$; the 
commutator subgroup of H will be of order $p^5$ if P is in the Σ determined by 
a point of $S_3$ not on a Σ-line of $S_3$; this commutator subgroup will be of order 
$p^4$ if P is in the Σ determined by a point of the Σ-line in $S_3$. 

When 6' is satisfied, the unique Σ-line in $S_3$ means that there is one and only 
one three-space R in X whose Σ in S intersects $S_3$ in a line. Hence G contains 
one subgroup only of order $p^{10}$ with commutator subgroup of order $p^4$; the 
subgroup is therefore characteristic. 

The τ-plane π, which is in $S_3$ when 6' is satisfied, contains no Σ-line. Hence 
every point of π determines a subgroup of order $p^{10}$ in G, whose commutator 
subgroup has order $p^5$, except for the point where π intersects the Σ-line. 
The τ-plane is in the space tangent to V at a point M, and from this follows 
that M is in the five-space Σ determined by each point of π. M is the image 
on V of a line m in X, and m lies in every R determined by a point of π. m 
determines in G a subgroup $H_m$ of order $p^8$ which is non-abelian since M is not in 
$S_3$. The group $H_m$ is characterized by the fact that it is the only group of order 
$p^8$ that is contained in every one of the $1 + p + p^2$ subgroups of order $p^{10}$ 
determined by the points of π. $H_m$ is therefore characteristic in G; it is in the 
characteristic subgroup of order $p^{10}$ determined by the Σ-line. 

These characteristic subgroups of orders $p^8$ and $p^{10}$ together with conditions 
1, ..., 5 are enough to determine G, and they exist for every odd p. The 
subgroups are seen to be characteristic because they are uniquely defined. 
G has other characteristic subgroups which in number depend on p, but only 
on the size of p. The points of the τ-plane determine subgroups of order $p^{10}$ 
of G. The vertices of the frame of reference in $S_3$ are completely determined 
if $S_3$ is in canonical form (1, p. 699). The group of isomorphisms of G 
determines a group of collineations of X and this group leaves invariant every point 
of the τ-plane and it leaves invariant two points of the Σ-line, viz., its 
intersection with π and the conjugate of that point with respect to the quadratic 













intersection of V and the five-space Σ which contains the Σ-line. Thus every 
subgroup of order $p^{10}$ determined by the points of π is characteristic. 

It may be verified readily that the group of collineations of X induced by 
the group of isomorphisms of G is of order 2, and that the isomorphisms of G 
which induce the identity collineation constitute a group of order $(p - 1)p^{30}$, 
the p − 1 coming from replacing each $U_i$ by its kth power, k = 1, 2, ..., p − 1, 
and the $p^{30}$ from replacing $U_i$ by $c_i U_i$ where the $c_i$'s are arbitrary, independent 
elements of C. The order of the group of isomorphisms is therefore $2(p - 1)p^{30}$. 

When 6 is satisfied $S_3$ contains no special line and no special plane. The 
relation of $S_3$ to V determines in $S_3$ $p^2 + 1$ "rational" cubic curves, one and only 
one through each point (1, pp. 704, 715-716). The group of collineations of X 
which transforms $S_3$ into itself is induced by the Galois group Γ of GF($p^5$) 
relative to GF(p) and hence is of order 5. Γ transforms the cubics in sets of 
five and hence will leave invariant a number congruent to $p^2 + 1$, mod 5; 
and if a cubic is invariant it will contain a number of invariant points 
congruent to p + 1, mod 5. 
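
The counting behind this paragraph can be checked directly. The sketch below is illustrative only (the function names are hypothetical, introduced here): it confirms that $p^2 + 1$ curves with $p + 1$ points each, one through each point, account for exactly the $1 + p + p^2 + p^3$ points of a projective three-space over GF(p), and it records the orbit argument that a group of order 5 fixes a number of objects congruent, mod 5, to the total number.

```python
def points_of_three_space(p):
    """Number of points of a projective three-space over GF(p)."""
    return 1 + p + p**2 + p**3

def points_covered_by_cubics(p):
    """p^2 + 1 disjoint curves, each carrying p + 1 points."""
    return (p**2 + 1) * (p + 1)

def fixed_residue_mod_5(total):
    """A group of prime order 5 moves objects in orbits of size 1 or 5,
    so the number of fixed objects is congruent to the total, mod 5."""
    return total % 5

if __name__ == "__main__":
    for p in (3, 7, 11, 19):
        assert points_of_three_space(p) == points_covered_by_cubics(p)
        print(p, fixed_residue_mod_5(p**2 + 1), fixed_residue_mod_5(p + 1))
```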

The relation of $S_3$ to V, by which a point P determines a three-space R in X, 
serves to determine for any point A in R a quadric surface in $S_3$ which passes 
through P; by the same relation a point A in X but not in R determines a 
quadric in $S_3$ which does not pass through P. Thus the points of X determine 
in $S_3$ a four-parameter set W of quadrics. In X there is a locus J, of dimension 
three and order four, whose points determine in $S_3$ cones with one vertex; 
every point of $S_3$ is the vertex of one and only one such cone of the set W. 
Thus each point of $S_3$ determines a point in X, as well as the three-space R. 
Each point of J, in X, determines a plane σ in X which is the double tangent 
plane of J at the point, and intersects J in a conic C whose points determine the 
cones with vertices on one of the $p^2 + 1$ cubics; each of these p + 1 cones 
contains the cubic curve. Two of the planes σ intersect in a point which is 
not on J, and this point determines a non-degenerate ruled quadric which 
bears the two cubics corresponding to the two planes. Moreover, every point 
of a plane σ not on C is on another σ. 

When p = 5t + 1, then both $p^2 + 1$ and p + 1 are congruent to 2, mod 5, 
and hence $S_3$ contains four points fixed under Γ. These four points determine 
four fixed points on J, the points which determine cones with vertices at the 
fixed points of $S_3$. Moreover, X contains a fifth fixed point, the point not on J 
which determines the non-degenerate ruled quadric containing the two fixed 
cubics. These five fixed points in X determine five abelian subgroups of order 
$p^7$ in G, and these subgroups are characteristic. They are contained in sets of 
two, three, four in subgroups of order $p^8$, $p^9$, and $p^{10}$, respectively, necessarily 
characteristic also. 

This use of the five fixed points indiscriminately does not make full use of 
the geometry. Four of the fixed points in X are on J and one is not. Let the 
fixed point not on J be $A_0$. $A_0$ is on $\sigma_1$ and $\sigma_2$, the double tangent planes of J 
determined by the fixed cubics $K_1$ and $K_2$ in $S_3$. In $\sigma_1$ and $\sigma_2$ are conics $C_1$ and 
$C_2$, intersections of the planes with J. $C_1$ and $C_2$ are fixed under Γ and so are 










the polars $l_1$ and $l_2$ of $A_0$ with respect to $C_1$ and $C_2$. The other four fixed points, 
necessarily on J, are $A_1$ and $A_2$ on $l_1$ and $C_1$, and $A_3$ and $A_4$ on $l_2$ and $C_2$. 

Distinctions can be made among the ten characteristic subgroups of order 
$p^8$ of G. We recall that each of the $1 + p + p^2 + p^3$ points of $S_3$ determines a 
unique three-space R of X and a subgroup of order $p^{10}$ of G whose commutator 
subgroup has order $p^5$. A line in $S_3$ determines a line in X and also a set of 
p + 1 three-spaces in X, and a set of p + 1 subgroups of order $p^{10}$ in G. Of 
the ten lines in X on pairs of the five points fixed under Γ, six are imaged on 
points of V at which the spaces tangent to V cut $S_3$ in lines; the remaining 
four do not have this property. Thus each of six of the characteristic 
subgroups of order $p^8$ is in p + 1 subgroups of order $p^{10}$ whose commutator 
subgroups are of order $p^5$; each of the other four characteristic subgroups of 
order $p^8$ is in only one such group of order $p^{10}$. 

Similar distinctions can be made among characteristic subgroups of orders 
$p^9$ and $p^{10}$; we will let one further example suffice. Every one of the 
characteristic subgroups of order $p^{10}$ contains six of the characteristic subgroups of 
order $p^8$. The one given by $A_1 A_2 A_3 A_4$ contains the four subgroups of order 
$p^8$ described at the end of the last paragraph; each of the other four 
characteristic subgroups of order $p^{10}$ contains only two of these. 

When p = 5t − 1, $S_3$ contains two cubics fixed under Γ, but contains no 
fixed points. The cubics determine the planes $\sigma_1$ and $\sigma_2$ in X, and the 
intersection of $\sigma_1$ and $\sigma_2$ is a fixed point $A_0$. $A_0$ determines a characteristic subgroup 
of order $p^7$ which is abelian and the only characteristic subgroup of its order. 
$\sigma_1$ and $\sigma_2$ contain conics $C_1$ and $C_2$, on J, and the polars $l_1$ and $l_2$ of $A_0$ with 
respect to them. $l_1$ and $l_2$ determine characteristic subgroups of order $p^8$ of 
G. $A_0$ with $l_1$ and $l_2$ separately determines characteristic subgroups of order 
$p^9$, and $l_1$ and $l_2$ determine a characteristic subgroup of order $p^{10}$. 

When p = 5t + 2, $S_3$ contains no fixed cubic and no fixed point. However, 
$1 + p + p^2 + p^3 + p^4 \equiv 1$, mod 5, and hence X contains one fixed point and 
one fixed three-space. Thus G contains one characteristic subgroup of each 
of orders $p^7$ and $p^{10}$; G contains no characteristic subgroup of order $p^8$ or $p^9$. 
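
The three congruences used in the cases p = 5t + 1, p = 5t − 1 and p = 5t + 2 are easy to tabulate. The following sketch is only a numerical check (the function name and dictionary keys are hypothetical labels chosen here); it evaluates, mod 5, the number of cubics, the number of points on a cubic, and the number of points of the four-space X, for one sample prime from each class.

```python
def residues_mod_5(p):
    """Residues mod 5 of the three counts used in the text, for a prime p != 5."""
    return {
        "cubics": (p**2 + 1) % 5,                          # number of cubics in the three-space
        "points_on_cubic": (p + 1) % 5,                    # points on one cubic
        "points_of_X": (1 + p + p**2 + p**3 + p**4) % 5,   # points of the four-space X
    }

if __name__ == "__main__":
    # 11 = 5*2 + 1, 19 = 5*4 - 1, 7 = 5*1 + 2
    for p in (11, 19, 7):
        print(p, residues_mod_5(p))
```

For p = 11 the first two residues are both 2, for p = 19 they are 2 and 0, and for p = 7 the cubic count is divisible by 5 while the point count of X leaves remainder 1, in agreement with the three cases described above.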

Thus a group of order $p^{11}$ satisfying 1, ..., 6 has characteristic subgroups 
in numbers and orders as follows: 

                $p^7$   $p^8$   $p^9$   $p^{10}$ 
p = 5t + 1        5      10      10       5 
p = 5t − 1        1       2       2       1 
p = 5t + 2        1       0       0       1 


4. Concluding remarks. The study of finite groups conformal with the 
abelian groups of orders $p^n$ and type 1, 1, ..., 1 immediately singles out the 
prime 2, since no such non-abelian group exists for p = 2. The maximum value 
of the class of a group whose elements are all of order p depends on the size 
of p, and so differentiates among primes; when the groups are restricted to be 
metabelian, that is, of class 2, this last distinction is lost. When the metabelian 













groups are ordered according to the number of generators, there is no occasion 
to distinguish one odd prime from another until this group of order $p^{11}$ is 
reached. Because of a duality in the geometry, the determination in the paper 
cited of all the groups of order $p^a$, $a \ge 11$, brings with it a determination of 
all of those for $a \le 9$. There remain to be examined the groups of order $p^{10}$. 
It is certain that many of the groups of order $p^{10}$ will require different treatment 
for different primes. 


REFERENCE 


1. H. R. Brahana, Metabelian p-groups with five generators and orders $p^{12}$ and $p^{11}$, Illinois J. 
Math., 2 (1958), 641-717. 


University of Illinois 






















A THEOREM ON PURE SUBMODULES 
GEORGE KOLETTIS, Jr. 


1. Introduction. In (1) Baer studied the following problem: If a 
torsion-free abelian group G is a direct sum of groups of rank one, is every 
direct summand of G also a direct sum of groups of rank one? For groups 
satisfying a certain chain condition, Baer gave a solution. Kulikov, in (3), 
supplied an affirmative answer, assuming only that G is countable. In a 
recent paper (2), Kaplansky settles the issue by reducing the general case 
to the countable case where Kulikov’s solution is applicable. As usual, the 
result extends to modules over a principal ideal ring R (commutative with 
unit, no divisors of zero, every ideal principal). 

The object of this paper is to carry out a similar investigation for pure 
submodules, a somewhat larger class of submodules than the class of direct 
summands. We ask: if the torsion-free R-module M is a direct sum of modules 
of rank one, is every pure submodule N of M also a direct sum of modules 
of rank one? Unlike the situation for direct summands, here the answer 
depends heavily on the ring R. If R is a field, there is no problem, and if R 
is a discrete valuation ring (one prime up to unit factors), it is easy to see 
that the answer is still yes. On the other hand, for abelian groups, or generally 
whenever R has an infinite number of primes, the question has a negative 
answer. 

We fill in the gap by showing that if R has exactly two primes, an affirmative 
answer is obtained provided N has finite rank. If N has infinite rank or if 
R has three or more primes, examples are given showing that N need not be 
a direct sum of modules of rank one. In contrast to the large number of 
theorems on principal ideal rings with one prime, this appears to be the 
first result true specifically for rings with two primes. 


2. Preliminaries. Let R be a principal ideal ring and K its quotient 
field. The unit of R is always assumed to act as unit operator on every 
R-module. We recall that a submodule N of an R-module M is pure if $aN = 
N \cap aM$ for every a in R. M is torsion-free if for a in R, x in M, and ax = 0, 
we have either a = 0 or x = 0. In this case, the intersection of pure submodules 
is pure, and so every subset of M generates a unique pure submodule. The 
rank of M is the cardinal number of a maximal set of linearly independent 
(over R) elements of M, or equivalently, the dimension of the K-vector 
space $K \otimes_R M$. 





Received March 4, 1959. The author wishes to thank Professor Kaplansky for suggesting 
this problem. This research was supported in part by the Office of Naval Research. 















The torsion-free R-modules of rank one are (up to isomorphism) the 
submodules of the R-module K. Two such submodules $M_1$ and $M_2$ are isomorphic 
if and only if $M_1 = aM_2$ for some a in K. In particular, $M_1$ is free precisely 
when $M_1 = aR$ for some a in K. For each prime p in R, we denote by $R_p$ 
the submodule of K consisting of those elements which can be written with 
a denominator prime to p. 

Let the torsion-free module M be a direct sum $M = \sum M_i$, i ranging over 
an index set, each $M_i$ of rank one. Let N be a pure submodule of M. We 
note that we can for our purpose confine ourselves to the case where none of 
the summands $M_i$ is free or divisible. 

Indeed, write $M = M' \oplus F$, where F is the sum of the free $M_i$'s and M' 
the sum of the remaining $M_i$'s. $N/(M' \cap N) \cong (M' + N)/M'$ is a 
submodule of M/M', which is free. Thus $M' \cap N$ is a direct summand of N 
whose complementary summand is free. It follows that N is a direct sum of 
modules of rank one whenever $M' \cap N$, a pure submodule of M', is a direct 
sum of modules of rank one. 

Next, write $M = D \oplus M''$, where D is the sum of all the divisible $M_i$'s 
and M'' the sum of the remaining $M_i$'s. The purity of N and the divisibility 
of D combine to yield the divisibility of $N \cap D$. Thus $N \cap D$ is a direct 
summand of N. The complementary summand $N/(N \cap D) \cong (N + D)/D$ 
is a submodule of $M/D \cong M''$. For any a in R, by the divisibility of D, 
we have $D = aD \subseteq aM$. The modular law then gives $aM \cap (N + D) = 
(aM \cap N) + D$, which, since N is pure, is just aN + D. Modulo D this 
becomes $(aM/D) \cap ((N + D)/D) = a(N + D)/D$, which is exactly the 
assertion that (N + D)/D is pure in M/D. So N is a direct sum of modules 
of rank one whenever $N/(N \cap D)$, which can be regarded as a pure submodule 
of M'', is a direct sum of modules of rank one. 

In conclusion, we remark that if R has just one prime, every rank one 
module is either free or divisible, and the above reductions are all that are 
needed to show that N is a direct sum of modules of rank one. 


3. R with two primes. Throughout this section, we assume that R has 
exactly two primes (up to unit factors). Denote them by p and q. The quotient 
field K is the set of all fractions $a/(p^m q^n)$, a in R, $m, n \ge 0$. The submodules 
of K fall into four classes according as they do or do not contain unbounded 
powers of p, and of q, in the denominators of their elements (when these are 
written in "lowest terms"). Using this classification it is easily seen that every 
submodule of K is isomorphic to one of R, $R_p$, $R_q$, or K. Thus these are the 
modules of rank one. 

Now for the theorem: 


THEOREM. Let the torsion-free module M be a direct sum of modules of rank 
one. Then every pure submodule of finite rank is also a direct sum of modules 
of rank one. 










Proof. We may suppose that M has finite rank and that each rank one 
summand is either a copy of $R_p$ or of $R_q$. Write $M = P \oplus Q$ where P is a 
direct sum of copies of $R_p$, and Q of copies of $R_q$. Choose elements $u_1, \dots, u_t$ 
in M so that $P = R_p u_1 \oplus \dots \oplus R_p u_s$ and $Q = R_q u_{s+1} \oplus \dots \oplus R_q u_t$ where 
$0 \le s \le t$. 

Assume that every pure submodule of rank n − 1 ($n \ge 2$) is a direct sum 
of modules of rank one, and let N be a pure submodule of rank n. For every 
k with $1 \le k \le t$, let $N_k$ be the intersection of N with the direct sum of all 
the rank one summands except the kth one. Then $N_k$ is a pure submodule of M 
whose rank is n − 1 or n depending on whether or not there is an element 
of N having a non-zero kth component. It will be sufficient to show that at 
least one of the $N_k$ of rank n − 1 is a direct summand of N, for such an $N_k$ 
is a direct sum of modules of rank one whose complementary summand is 
of rank one. 

There is no loss in generality in assuming that $Q \ne 0$ and that $N_t \ne N$. 
We consider two cases: 

Case I. $N \cap Q = 0$. 

Since $N_t \ne N$, $N_t$ is of rank n − 1. We will prove that $N_t$ is a direct 
summand of N by showing that $N/N_t$ is free. To do this, we need only show 
that when the elements of N are expressed in terms of the $u_i$'s there is an 
upper bound to the powers of p that can occur as denominators in the 
coefficients of $u_t$. 


Let $x_1, \dots, x_n$ be a maximal independent subset of N. If $x \ne 0$ is in N, 
some non-zero multiple of x, say rx, lies in the module generated by the 
$x_j$'s. If $rx = r_1x_1 + \dots + r_nx_n$, we can clearly suppose that not every one 
of $r, r_1, \dots, r_n$ is a multiple of p in R. 

Assume that $N/N_t$ is not free, and let m be a given positive integer. Then 
we can choose the element x so that, in the expressions for x and the $x_j$'s in 
terms of the $u_i$'s, the coefficient of $u_t$ for x has a power of p in its denominator 
so large in comparison to those for the $x_j$'s as to require r to be a multiple 
of $p^m$ in R. 

Using primes to denote images in $M/Q \cong P$, we observe that since 
$N \cap Q = 0$, the elements $x_1', \dots, x_n'$ are independent. Say $x' = c_1u_1' + \dots 
+ c_su_s'$ and $x_j' = c_{j1}u_1' + \dots + c_{js}u_s'$ where all the c's are in $R_p$. The $n \times s$ 
matrix $(c_{ji})$ thus obtained has all its rows independent. Say the first n columns 
are also independent, and let $D \ne 0$ be the determinant of the $n \times n$ 
submatrix $(c_{ji})$, $1 \le i, j \le n$. 

We have the following system of s equations: 

$rc_i = r_1c_{1i} + \dots + r_nc_{ni}$. 

From the first n of these, and the fact that r is in $p^mR$, we see that $r_jD$ is 
in $p^mR_p$ for each j. Hence D is in $p^mR_p$. 













Thus the assumption that $N/N_t$ is not free requires D to be in $\bigcap_m p^mR_p = 0$, 
a contradiction. 

Case II. $N \cap Q \ne 0$. 

Let $x \ne 0$ be an element of $N \cap Q$, say $x = b_{s+1}u_{s+1} + \dots + b_tu_t$, each 
$b_i$ in $R_q$. The result of dividing x by the largest power of q common to all of 
the $b_i$'s will again be an element of N (N is pure), and so we may as well 
assume that at least one of the $b_i$'s, say $b_k$, is not in $qR_q$. 

The submodule $R_qx$ is contained in N since N is pure. We have $N = N_k \oplus R_qx$. 
For since $b_k \ne 0$, $N_k \cap R_qx = 0$. On the other hand suppose w in N has $b_k^*u_k$ 
as its kth component where $b_k^*$ is in $R_q$. Since $b_k$ is not in $qR_q$, $b_k^{-1}$ is in $R_q$. 
Hence $w - b_k^*b_k^{-1}x$ is in $N_k$. This shows that $N = N_k + R_qx$. 


4. Examples. Let R be an arbitrary principal ideal ring and M a 
torsion-free R-module. One readily verifies that for every prime p in R, 

$\bigcap_{j=1}^{\infty} p^jM$ 

is a pure submodule of M. Since it is clear that a torsion-free module of rank 
one has no proper pure submodules, we see that if M is a direct sum $\sum M_i$ of 
modules of rank one, the submodule 

$\bigcap_{j=1}^{\infty} p^jM$ 

is the sum of those rank one summands $M_i$ for which $pM_i = M_i$ and is 
therefore a direct summand of M. This gives a necessary condition for a 
module to be a direct sum of modules of rank one. 

Using this condition, we give two examples. The first example shows that 
in the theorem the hypothesis of finite rank is indispensable. The second 
example shows that the theorem cannot survive the presence of three primes. 


Example 1. We assume again that R has just two primes, p and q. Let 
$M = P \oplus Q$, where $P = R_pu_0$ is a copy of $R_p$ and Q is a direct sum 

$\sum_{i=1}^{\infty} R_qu_i$ 

of an infinite number of copies of $R_q$. Let N be the pure submodule generated 
by all the elements $(1/q^i)u_0 - u_i$. We will show that N is not a direct sum 
of modules of rank one. 

First, we note that $N \cap P = 0$. Indeed, an element of P will lie in N only 
if some non-zero multiple of it lies in the module generated by the elements 
$(1/q^i)u_0 - u_i$. Clearly, for $a_i$ in R, a sum 

$\sum_i a_i\bigl((1/q^i)u_0 - u_i\bigr)$ 

can only be in P if each $a_i = 0$. 










Next, we note that since every element $(1/q^i)u_0$ lies in N + Q, we have 
M = N + Q. 

Since N is pure, 

$\bigcap_{j=1}^{\infty} p^jN$ 

is $N \cap Q$. Now 

$N/(N \cap Q) \cong (N + Q)/Q \cong R_p = qR_p$. 

Any submodule L of M for which qL = L must be contained in P. Thus a 
complementary summand for $N \cap Q$ in N must be contained in $N \cap P = 0$. 
Since it is clear that $N \cap Q \ne N$, we conclude that $N \cap Q$ is not a direct 
summand of N and that N is not a direct sum of modules of rank one. 


Example 2. Let R have at least three non-associated primes. Say p, q, and 
r are three of them. Let $M = R_pu_1 \oplus R_qu_2 \oplus R_ru_3$ be the direct sum of a 
copy each of $R_p$, $R_q$, and $R_r$. Let N be the pure submodule generated by 
$u_1 - u_2$ and $u_1 - u_3$. It is immediate that $N \cap R_pu_1 = 0$. 

N contains all the elements $(1/p^k)(u_2 - u_3)$, $(1/q^m)(u_1 - u_3)$, 
$(1/r^n)(u_1 - u_2)$. If a and b are elements of R for which $aq^m + br^n = 1$, 
then the element 

$\frac{b}{q^m}(u_1 - u_3) + \frac{a}{r^n}(u_1 - u_2) - \frac{1}{q^mr^n}u_1$ 

lies in $R_qu_2 + R_ru_3$. This shows that all elements of the form $(1/q^mr^n)u_1$ lie 
in $N + R_qu_2 + R_ru_3$. It follows that $M = N + R_qu_2 + R_ru_3$. 
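
The cancellation of the $u_1$ component in the combination above can be checked with exact rational arithmetic. The sketch below is an illustration under stated assumptions: R is taken to be the ring of integers, with q = 3 and r = 5 as two of its primes, and the helper names `bezout` and `example2_combination` are hypothetical. It finds a and b with $aq^m + br^n = 1$ by the extended Euclidean algorithm and returns the coefficients of $u_1$, $u_2$, $u_3$ in the combination.

```python
from fractions import Fraction

def bezout(x, y):
    """Extended Euclid: return (g, a, b) with a*x + b*y == g == gcd(x, y)."""
    old_r, r = x, y
    old_a, a = 1, 0
    old_b, b = 0, 1
    while r:
        k = old_r // r
        old_r, r = r, old_r - k * r
        old_a, a = a, old_a - k * a
        old_b, b = b, old_b - k * b
    return old_r, old_a, old_b

def example2_combination(q, r, m, n):
    """Coefficients on (u1, u2, u3) of
    (b/q^m)(u1 - u3) + (a/r^n)(u1 - u2) - (1/(q^m r^n)) u1,
    where a*q^m + b*r^n = 1."""
    g, a, b = bezout(q**m, r**n)
    if g != 1:
        raise ValueError("q and r must be distinct primes")
    t13 = Fraction(b, q**m)          # multiplies (u1 - u3)
    t12 = Fraction(a, r**n)          # multiplies (u1 - u2)
    t1 = Fraction(1, q**m * r**n)    # multiplies u1
    return (t13 + t12 - t1, -t12, -t13)

if __name__ == "__main__":
    print(example2_combination(3, 5, 2, 1))
```

The $u_1$ coefficient vanishes, the $u_2$ coefficient has a denominator prime to q, and the $u_3$ coefficient has a denominator prime to r, which is what membership in $R_qu_2 + R_ru_3$ requires.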


As in the first example, 

$\bigcap_{j=1}^{\infty} p^jN = N \cap (R_qu_2 + R_ru_3)$ 

is not a direct summand of N. For 

$N/(N \cap (R_qu_2 + R_ru_3)) \cong R_p = qrR_p$, 

and a submodule L of N for which qrL = L must be contained in $R_pu_1 \cap N = 0$. 


REFERENCES 


1. R. Baer, Abelian groups without elements of finite order, Duke Math. J., 3 (1937), 68-122. 

2. I. Kaplansky, Projective modules, Ann. Math., 68 (1958), 372-377. 

3. L. Kulikov, On direct decompositions of groups, Ukrain. Mat. Z., 4 (1952), 230-275, 347-372 
(Russian) = Amer. Math. Soc. Translations, Ser. 2, 2 (1956), 23-87. 


University of Notre Dame 











NODAL NON-COMMUTATIVE JORDAN ALGEBRAS 
LOUIS A. KOKORIS 


1. Introduction. A finite dimensional power-associative algebra $\mathfrak{A}$ with 
a unity element 1 over a field $\mathfrak{F}$ is called a nodal algebra by Schafer (7) if 
every element of $\mathfrak{A}$ has the form $\alpha 1 + z$ where $\alpha$ is in $\mathfrak{F}$, z is nilpotent, and if 
$\mathfrak{A}$ does not have the form $\mathfrak{A} = \mathfrak{F}1 + \mathfrak{N}$ with $\mathfrak{N}$ a nil subalgebra of $\mathfrak{A}$. An 
algebra $\mathfrak{A}$ is called a non-commutative Jordan algebra if $\mathfrak{A}$ is flexible and $\mathfrak{A}^+$ 
is a Jordan algebra. Some examples of nodal non-commutative Jordan algebras 
were given in (5) and it was proved in (6) that if $\mathfrak{A}$ is a simple nodal 
non-commutative Jordan algebra of characteristic not 2, then $\mathfrak{A}^+$ is associative. In 
this paper we describe all simple nodal non-commutative Jordan algebras of 
characteristic not 2. Any such algebra has the form $\mathfrak{A} = \mathfrak{F}1 + \mathfrak{N}$ with 
$\mathfrak{N}^+ = \mathfrak{F}[x_1, \dots, x_n]$ for some n where the generators are all nilpotent of index 
p. The $x_i$ can be selected so that $x_ix_j = \alpha_{ij}1 + w_{ij}$ for $w_{ij}$ in $\mathfrak{N}$ and $\alpha_{ij}$ in $\mathfrak{F}$ 
such that, for each i, some $\alpha_{ij} \ne 0$. Moreover, the multiplication table of $\mathfrak{A}$ 
is given by 

(1) $f(x_1, \dots, x_n)\,g(x_1, \dots, x_n) = f \cdot g + \tfrac{1}{2} \sum_{i,j} \frac{\partial f}{\partial x_i} \cdot \frac{\partial g}{\partial x_j} \cdot [x_i, x_j]$ 

where the dot product $a \cdot b = \tfrac{1}{2}(ab + ba)$ is the product of $\mathfrak{A}^+$ and $[x_i, x_j] = 
x_ix_j - x_jx_i$. 

The author would like to express his great indebtedness to R. D. Schafer 
for finding errors in the original manuscript and for showing how they could 
be corrected. 


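2. Properties of $\mathfrak{A}^+$. If $\mathfrak{D}$ is the derivation algebra of an algebra $\mathfrak{B}$, then 
Albert in (1) calls $\mathfrak{B}$ $\mathfrak{D}$-simple if there exists no ideal $\mathfrak{M}$, other than $\mathfrak{B}$ or 0, 
such that mD is in $\mathfrak{M}$ for every m in $\mathfrak{M}$ and D in $\mathfrak{D}$. We use a result of Harper 
(2) which for our purposes may be stated as follows. 


THEOREM 1. (Harper) Let $\mathfrak{B}$ be a commutative associative algebra with a 
unity quantity 1 over a field $\mathfrak{F}$ and let $\mathfrak{B}$ have the form $\mathfrak{B} = \mathfrak{F}1 + \mathfrak{N}$ with $\mathfrak{N}$ 
the radical of $\mathfrak{B}$. Also let $\mathfrak{B}$ be $\mathfrak{D}$-simple where $\mathfrak{D}$ is any set of derivations on $\mathfrak{B}$. 
Then $\mathfrak{N} = \mathfrak{F}[x_1, \dots, x_n]$ for some n where the generators $x_i$ have index p, p 
the characteristic of $\mathfrak{F}$. 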


We remark that it is known that a $\mathfrak{D}$-simple algebra cannot have 
characteristic zero and Schafer has shown in (7) that a nodal non-commutative 

Received February 17, 1958; in revised form August 31, 1959. Presented to the American 
Mathematical Society with the title Nodal flexible associative-admissible algebras on November 
29, 1957. 




Jordan algebra cannot have characteristic zero. He also uses a theorem of 
Jacobson (4) to prove that $\mathfrak{N}^+$ is a subalgebra of $\mathfrak{A}^+$ for any nodal 
non-commutative Jordan algebra. 


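THEOREM 2. Let $\mathfrak{A}$ be a simple nodal non-commutative Jordan algebra over a 
field $\mathfrak{F}$ whose characteristic is not 2. Let $\mathfrak{D}$ be the derivation algebra of $\mathfrak{A}^+$. Then 
$\mathfrak{A}^+$ is $\mathfrak{D}$-simple. 


Suppose $\mathfrak{A}^+$ is not $\mathfrak{D}$-simple. Then there is an ideal $\mathfrak{B}$ of $\mathfrak{A}^+$ such that 
$\mathfrak{B}\mathfrak{D} \subseteq \mathfrak{B}$. We shall show that $\mathfrak{B}$ is then an ideal of $\mathfrak{A}$, contradicting the 
fact that $\mathfrak{A}$ is simple. The mapping bD = [b, c] where c is any element of $\mathfrak{A}$ 
and [b, c] = bc − cb is a derivation of $\mathfrak{A}^+$. This is so because $(a \cdot b)D = aD \cdot b 
+ a \cdot bD$ if and only if $[a \cdot b, c] = [a, c] \cdot b + a \cdot [b, c]$ and the last identity follows 
from (ab)c + (cb)a = a(bc) + c(ba), the linearized form of the flexible law 
(ab)a = a(ba). Now let b be in $\mathfrak{B}$ and a in $\mathfrak{A}$. Since $\mathfrak{B}$ is a $\mathfrak{D}$-ideal of $\mathfrak{A}^+$, 
bD = [b, a] is in $\mathfrak{B}$. Also, since $\mathfrak{B}$ is an ideal of $\mathfrak{A}^+$, $a \cdot b$ is in $\mathfrak{B}$. Then ba − ab 
and ab + ba in $\mathfrak{B}$ imply ab and ba are in $\mathfrak{B}$. That is, $\mathfrak{B}$ is an ideal of $\mathfrak{A}$. 


COROLLARY. If $\mathfrak{A} = \mathfrak{F}1 + \mathfrak{N}$ is a simple nodal non-commutative Jordan 
algebra over a field $\mathfrak{F}$ whose characteristic is not 2, then $\mathfrak{N}^+ = \mathfrak{F}[x_1, \dots, x_n]$ for 
some n, where $x_i^p = 0$, $x_i^{p-1} \ne 0$. Thus, $\mathfrak{A}$ has order $p^n$. 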


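3. The multiplication table of $\mathfrak{A}$. Assume that $\mathfrak{A}$ is simple so that, by 
the corollary above, $\mathfrak{A}^+ = \mathfrak{F}[1, x_1, \dots, x_n]$ with $x_i^p = 0$. In (3), Jacobson 
has shown that if D is any derivation on $\mathfrak{A}^+$, then 

$fD = \sum_{i=1}^{n} \frac{\partial f}{\partial x_i} \cdot a_i$ 

for any f in $\mathfrak{A}^+$ and for $a_i$ in $\mathfrak{A}^+$. The $a_i$ of course depend on the derivation 
D. If g is any element of $\mathfrak{A}^+$, we have seen that the mapping fD = [f, g] is a 
derivation of $\mathfrak{A}^+$. Hence 

$fD = [f, g] = \sum_i \frac{\partial f}{\partial x_i} \cdot a_i(g)$. 

To evaluate the $a_i(g)$, we note that $x_iD = [x_i, g] = a_i(g)$ and 

$[g, x_i] = \sum_j \frac{\partial g}{\partial x_j} \cdot a_j(x_i)$. 

Since $[x_i, g] = -[g, x_i]$, 

$a_i(g) = -\sum_j \frac{\partial g}{\partial x_j} \cdot a_j(x_i)$ 

and since $[x_j, x_i] = a_j(x_i)$, it follows that 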











490 LOUIS A. KOKORIS 


THEOREM 3. Jf & is a simple algebra, then for any f, g in U, 


fe=fet+s > 2% 


ot ax, ax, x3]. 


This result. follows from the above formula for [f,g] and the fact that 
fe =f-¢ +41, g]. The assumption that & is nodal implies that at least one 
of the [x,, x;] is not in NR. This is equivalent to the statement that for some 
i, j, Xx, is not in N. 


THEOREM 4. The generators $x_1, \ldots, x_n$ can be selected so that $x_i x_j = \alpha_{ij} 1 + w_{ij}$
with $w_{ij}$ in $\mathfrak{N}$ and $\alpha_{ij}$ in $\mathfrak{F}$ such that, for each $i$, some $\alpha_{ij} \neq 0$.

Let $\mathfrak{M}$ be the vector space with $x_1, \ldots, x_n$ as a basis. If we write $\alpha_{ij} =
\alpha(x_i, x_j)$, then $x_j x_i = 2x_i \cdot x_j - x_i x_j = -\alpha_{ij} 1 - w_{ij} + 2x_i \cdot x_j$, together with the
fact that $x_i \cdot x_j$ is in $\mathfrak{N}$, implies that $\alpha(x_j, x_i) = -\alpha(x_i, x_j)$. Therefore $\alpha(x_i, x_j)$
is a skew-symmetric bilinear form on $\mathfrak{M}$. If the rank of the form is $2r$, there
exists a basis $x_1', \ldots, x_n'$ such that we have the canonical form

$$\alpha(x_i', x_{i+r}') = 1 = -\alpha(x_{i+r}', x_i')$$

for $i \le r$, $\alpha(x_i', x_j') = 0$ for all other pairs $i$, $j$. Next take $x_i'' = x_i'$ for $i \le 2r$
and $x_i'' = x_i' + x_1'$ for $i > 2r$. Then, if $i \le r$, $\alpha(x_i'', x_{i+r}'') = \alpha(x_i', x_{i+r}') = 1$;
if $r < i \le 2r$,

$$\alpha(x_i'', x_{i-r}'') = \alpha(x_i', x_{i-r}') = -\alpha(x_{i-r}', x_{(i-r)+r}') = -1;$$

and if $i > 2r$, $\alpha(x_i'', x_{r+1}'') = \alpha(x_i' + x_1', x_{r+1}') = \alpha(x_1', x_{r+1}') = 1$. The basis
$x_1'', \ldots, x_n''$ of $\mathfrak{M}$ has the properties stated in Theorem 4.


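4. Construction of algebras. Let $\mathfrak{F}$ be any field of characteristic $p \neq 2$.
Define $\mathfrak{A}^+$ by $\mathfrak{A}^+ = \mathfrak{F}1 + \mathfrak{N}^+$ where $\mathfrak{N}^+ = \mathfrak{F}[x_1, \ldots, x_n]$ with $x_1, \ldots, x_n$
nilpotent generators of index $p$. That is, $\mathfrak{A}^+$ consists of elements $\alpha 1 + z$
where $\alpha$ is in $\mathfrak{F}$, 1 is the unity quantity of $\mathfrak{A}^+$, and $z$ is a polynomial in $x_1, \ldots, x_n$.
Define the algebra $\mathfrak{A} = \mathfrak{F}1 + \mathfrak{N}$ to be the same vector space as $\mathfrak{A}^+$ and to
have a product defined by $x_i x_j = \alpha_{ij} 1 + w_{ij}$ for any $\alpha_{ij} = -\alpha_{ji}$ in $\mathfrak{F}$
and $w_{ij} = 2x_i \cdot x_j - w_{ji}$ in $\mathfrak{N}$, $i < j$. Further define

$$fg = f \cdot g + \tfrac{1}{2} \sum_{i,j} \frac{\partial f}{\partial x_i} \cdot \frac{\partial g}{\partial x_j} \cdot [x_i, x_j]$$

for $f$, $g$ any elements in $\mathfrak{A}$.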


THEOREM 5. If at least one $\alpha_{ij} \neq 0$, the algebra $\mathfrak{A}$ described above is a nodal
non-commutative Jordan algebra.


Linearization of the flexible law $(fg)f = f(gf)$ yields the identity $(fg)h + (hg)f
= f(gh) + h(gf)$. Add $(gf)h + (gh)f$ to both sides of the equality to obtain

$$(2) \qquad (f \cdot g)h + (g \cdot h)f = (gf) \cdot h + (gh) \cdot f.$$








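Since $\mathfrak{F}$ has characteristic $\neq 2$, flexibility is equivalent to identity (2). The
expression

$$(gf) \cdot h + (gh) \cdot f - (g \cdot h)f - (f \cdot g)h$$

$$= f \cdot g \cdot h + \tfrac{1}{2} \sum_{i,j} \frac{\partial g}{\partial x_i} \cdot \frac{\partial f}{\partial x_j} \cdot [x_i, x_j] \cdot h + f \cdot g \cdot h + \tfrac{1}{2} \sum_{i,j} \frac{\partial g}{\partial x_i} \cdot \frac{\partial h}{\partial x_j} \cdot [x_i, x_j] \cdot f$$

$$- f \cdot g \cdot h - \tfrac{1}{2} \sum_{i,j} \frac{\partial (g \cdot h)}{\partial x_i} \cdot \frac{\partial f}{\partial x_j} \cdot [x_i, x_j] - f \cdot g \cdot h - \tfrac{1}{2} \sum_{i,j} \frac{\partial (f \cdot g)}{\partial x_i} \cdot \frac{\partial h}{\partial x_j} \cdot [x_i, x_j].$$

Using

$$\frac{\partial (a \cdot b)}{\partial x_i} = \frac{\partial a}{\partial x_i} \cdot b + a \cdot \frac{\partial b}{\partial x_i},$$

the above expression becomes

$$\tfrac{1}{2} \sum_{i,j} [x_i, x_j] \cdot \Big( \frac{\partial g}{\partial x_i} \frac{\partial f}{\partial x_j} h + \frac{\partial g}{\partial x_i} \frac{\partial h}{\partial x_j} f - \frac{\partial g}{\partial x_i} \frac{\partial f}{\partial x_j} h - \frac{\partial h}{\partial x_i} \frac{\partial f}{\partial x_j} g - \frac{\partial f}{\partial x_i} \frac{\partial h}{\partial x_j} g - \frac{\partial g}{\partial x_i} \frac{\partial h}{\partial x_j} f \Big)$$

$$= -\tfrac{1}{2} \sum_{i,j} [x_i, x_j] \cdot \Big( \frac{\partial h}{\partial x_i} \frac{\partial f}{\partial x_j} + \frac{\partial f}{\partial x_i} \frac{\partial h}{\partial x_j} \Big) \cdot g$$

$$= f \cdot g \cdot h - (hf) \cdot g + f \cdot g \cdot h - (fh) \cdot g = 0,$$

as desired. The algebra is nodal since at least one $\alpha_{ij}$ is not zero.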
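The construction can be checked numerically in a small case. The following is a minimal sketch, not the author's computation: it assumes $p = 3$, $n = 2$, $\alpha_{12} = 1 = -\alpha_{21}$, and the particular choice $w_{ij} = x_i \cdot x_j$, so that $\tfrac{1}{2}[x_i, x_j] = \alpha_{ij} 1$. All function names are illustrative. It verifies the linearized flexible law on every triple of basis monomials.

```python
from itertools import product

P = 3                                # assumed characteristic p (odd prime)
ALPHA = {(0, 1): 1, (1, 0): -1}      # assumed alpha_12 = 1 = -alpha_21 (0-based indices)

def add(f, g):
    """Sum of two polynomials stored as {exponent tuple: coefficient mod P}."""
    h = dict(f)
    for mono, c in g.items():
        v = (h.get(mono, 0) + c) % P
        if v:
            h[mono] = v
        else:
            h.pop(mono, None)
    return h

def dot(f, g):
    """Commutative associative product f.g in F[x_1, x_2] with x_i^p = 0."""
    h = {}
    for (m1, c1), (m2, c2) in product(f.items(), g.items()):
        mono = tuple(a + b for a, b in zip(m1, m2))
        if max(mono) >= P:           # truncated: x_i^p = 0
            continue
        h[mono] = (h.get(mono, 0) + c1 * c2) % P
    return {m: c for m, c in h.items() if c}

def diff(f, i):
    """Partial derivative with respect to the i-th generator."""
    h = {}
    for mono, c in f.items():
        if mono[i]:
            new = mono[:i] + (mono[i] - 1,) + mono[i + 1:]
            h[new] = (h.get(new, 0) + c * mono[i]) % P
    return {m: c for m, c in h.items() if c}

def mul(f, g):
    """fg = f.g + sum_{i,j} alpha_ij (df/dx_i).(dg/dx_j), using (1/2)[x_i, x_j] = alpha_ij."""
    h = dot(f, g)
    for (i, j), a in ALPHA.items():
        term = dot(diff(f, i), diff(g, j))
        h = add(h, {m: (a * c) % P for m, c in term.items()})
    return h

BASIS = [{(a, b): 1} for a in range(P) for b in range(P)]

def linearized_flexible(f, g, h):
    """Check (fg)h + (hg)f == f(gh) + h(gf)."""
    return add(mul(mul(f, g), h), mul(mul(h, g), f)) == add(mul(f, mul(g, h)), mul(h, mul(g, f)))

x1, x2 = {(1, 0): 1}, {(0, 1): 1}
print(mul(x1, x2), mul(x2, x1))      # x1*x2 and x2*x1 differ, and x1*x2 has a constant term
print(all(linearized_flexible(f, g, h) for f, g, h in product(BASIS, repeat=3)))
```

Since the identity is trilinear, checking it on basis monomials suffices; putting $h = f$ then gives flexibility because the characteristic is not 2. The constant term of `mul(x1, x2)` reflects the nodal property: $x_1 x_2$ is not nilpotent.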

The proof of Theorem 4 depends only on $\mathfrak{A}$ having the form as described
at the beginning of this section and it is not necessary for $\mathfrak{A}$ to be simple in
order to obtain the result of Theorem 4. Thus we may assume that the
generators $x_1, \ldots, x_n$ have the properties of Theorem 4 and that we have
the associated bilinear form of rank $2r$.


THEOREM 6. If $n = 2r$, then $\mathfrak{A}$ is simple.


Suppose $\mathfrak{B}$ is a proper ideal of $\mathfrak{A}$. Then there exists a polynomial $f = f(x_1, \ldots,
x_n)$ in $\mathfrak{B}$ with least possible degree $t$ in $x_1, \ldots, x_n$. Since $n = 2r$, $\alpha_{ij} = 0$ except
for the following: $\alpha_{i,i+r} = 1$ for $i \le r$; and $\alpha_{i,i-r} = -1$ for $r < i \le 2r$. Then
for each $i$ there exists a $k$ such that $\alpha_{ki} \neq 0$ but $\alpha_{kj} = 0$ for all $j \neq i$. Then
for this $k$,

$$x_k f = \sum_j \alpha_{kj} \frac{\partial f}{\partial x_j} + \text{terms of degree} \ge t = \alpha_{ki} \frac{\partial f}{\partial x_i} + \text{terms of degree} \ge t.$$

Therefore, if any monomial of $f$ of degree $t$ has a power of $x_i$ as a factor, $x_k f$ is
a polynomial of degree $t - 1$. The fact that $f$ is in $\mathfrak{B}$ implies that $x_k f$ is in $\mathfrak{B}$
and this contradicts the assumption that $f$ has minimal degree $t$.

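If $n > 2r$, $\mathfrak{A}$ is not necessarily simple. For example, consider $x_1 - x_{2r+1}$,
which has the property that $(x_1 - x_{2r+1})\mathfrak{A} \subseteq \mathfrak{N}$. Then $\mathfrak{B} = (x_1 - x_{2r+1}) \cdot \mathfrak{A}$
is an ideal of $\mathfrak{A}$ if

$$[(x_1 - x_{2r+1}) \cdot g] f = (x_1 - x_{2r+1}) \cdot g \cdot f + \tfrac{1}{2} \sum_{i,j} \frac{\partial [(x_1 - x_{2r+1}) \cdot g]}{\partial x_i} \cdot \frac{\partial f}{\partial x_j} \cdot [x_i, x_j]$$

$$= (x_1 - x_{2r+1}) \cdot g \cdot f + \tfrac{1}{2} \sum_j g \cdot \frac{\partial f}{\partial x_j} \cdot [x_1 - x_{2r+1}, x_j] + \tfrac{1}{2} \sum_{i,j} \frac{\partial g}{\partial x_i} \cdot \frac{\partial f}{\partial x_j} \cdot [x_i, x_j] \cdot (x_1 - x_{2r+1})$$

is in $\mathfrak{B}$ for every $g$ and $f$ in $\mathfrak{A}$. This will be so if $[x_1 - x_{2r+1}, x_j]$ is in $\mathfrak{B}$ for
every $j$. This can be accomplished by setting $x_1 x_j = \alpha_{1j} 1 + x_1 \cdot x_j$ and
$x_{2r+1} x_j = \alpha_{2r+1,j} 1 + x_{2r+1} \cdot x_j$. Then $[x_1 - x_{2r+1}, x_j] = 0$ is certainly in $\mathfrak{B}$ for
every $j$.


It seems clear that whether or not $\mathfrak{A}$ is simple with $n > 2r$ depends on the
nature of the nilpotent elements $w_{ij}$.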


REFERENCES 


1. A. A. Albert, On commutative power-associative algebras of degree two, Trans. Amer. Math.
Soc., 74 (1953), 323-343.

2. L. R. Harper, Some properties of partially stable algebras, University of Chicago Ph.D.
dissertation.

3. N. Jacobson, Classes of restricted Lie algebras of characteristic p. II, Duke Math. J., 10 (1943),
107-121.

4. N. Jacobson, A theorem on the structure of Jordan algebras, Proc. Nat. Acad. Sci. U.S.A., 42 (1956),
140-147.

5. L. A. Kokoris, Some nodal noncommutative Jordan algebras, Proc. Amer. Math. Soc., 9 (1958),
164-166.

6. L. A. Kokoris, Simple nodal noncommutative Jordan algebras, Proc. Amer. Math. Soc., 9 (1958),
652-654.

7. R. D. Schafer, On noncommutative Jordan algebras, Proc. Amer. Math. Soc., 9 (1958),
110-117.


Illinois Institute of Technology 



















GENERALIZED LIE ELEMENTS 
RIMHAK REE 


Introduction. Let $\lambda(ij)$, $i, j = 1, 2, \ldots, m$, be $m^2$ elements in a field $K$
of characteristic zero such that $\lambda(ij)\lambda(ji) = 1$ for all $i$ and $j$, and $x_1, x_2, \ldots, x_m$
non-commutative associative indeterminates over $K$. Define the elements
$[x_{i_1} x_{i_2} \ldots x_{i_n}]$ inductively by $[x_i] = x_i$ and

$$[x_{i_1} x_{i_2} \ldots x_{i_n}] = x_{i_1} [x_{i_2} \ldots x_{i_n}] - \prod_{\nu=2}^{n} \lambda(i_1 i_\nu) \, [x_{i_2} \ldots x_{i_n}] \, x_{i_1}.$$

Any linear combination of the elements

$$[x_{i_1} x_{i_2} \ldots x_{i_n}]$$

with coefficients in $K$ will be called a generalized Lie element. Generalized
Lie elements reduce to ordinary Lie elements if $\lambda(ij) = 1$ for all $i$ and $j$.

The purpose of this paper is to generalize to the generalized Lie elements
the following: a theorem of Friedrichs, a theorem of Dynkin-Specht-Wever
(2), and the Witt formula on the dimension of the space spanned by homogeneous
Lie elements of a fixed degree. The set of all generalized Lie elements
will be made into an algebra which generalizes the ordinary free Lie algebra.
This algebra turns out to be free in a certain sense. We shall also generalize
the algebra associated with shuffles in (2).¹


1. Generalized Lie algebras. Throughout this paper $K$ will denote a
field of characteristic zero. By a bi-character in $K$ of an additively written
abelian semi-group $M$ we shall mean a map $\chi: M \times M \to K$ satisfying the
following:

$$\chi(\rho, \sigma + \tau) = \chi(\rho, \sigma)\chi(\rho, \tau), \qquad \chi(\rho + \sigma, \tau) = \chi(\rho, \tau)\chi(\sigma, \tau)$$

for all $\rho, \sigma, \tau$ in $M$. A bi-character $\chi$ will be called skew-symmetric if
$\chi(\sigma, \tau)\chi(\tau, \sigma) = 1$ for all $\sigma, \tau$ in $M$. An (associative or non-associative) algebra $A$
over $K$ is said to be graded by the semi-group $M$ if $A$ is a direct sum of subspaces
$A_\rho$ indexed by $\rho \in M$ such that $f \in A_\rho$ and $g \in A_\sigma$ imply $fg \in A_{\rho+\sigma}$.

Let $L$ be an algebra graded by $M$, and let $\chi$ be a skew-symmetric bi-character
of $M$ in $K$. We shall call $L$ a generalized Lie algebra of type $\chi$, or
simply a $\chi$-algebra, if $f \in L_\rho$, $g \in L_\sigma$, imply

$$[f, g] + \chi(\rho, \sigma)[g, f] = 0;$$


Received March 30, 1959. 

¹ The referee remarks that the algebras considered in this paper include, as a special case,
the "left Lie algebras" which are used in homological algebra (cf. for example, the exposition
by P. Cartier in Séminaire Bourbaki, May, 1955).




$$[f, [g, h]] - \chi(\rho, \sigma)[g, [f, h]] = [[f, g], h],$$

where $[f, g]$ denotes the product in $L$ of $f$ and $g$. In case $\chi$ is trivial, a $\chi$-algebra
is clearly an ordinary Lie algebra. Let $A$ be an associative algebra graded by
$M$. Define a new multiplication $[a, b]$ in the vector space $A$ by

$$[a, b] = ab - \chi(\rho, \sigma)ba,$$

where $a \in A_\rho$, $b \in A_\sigma$. Then we obtain a new algebra which we shall denote
by $[A]$. It can be seen easily that $[A]$ is a $\chi$-algebra.
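The claim that $[A]$ is a $\chi$-algebra can be checked mechanically on words of a free associative algebra. The sketch below is illustrative only: the values in `LAM` are hypothetical sample data for $m = 2$ generators (chosen so that $\lambda(ij)\lambda(ji) = 1$), elements are stored as dictionaries from words to coefficients, and the helper names are not from the paper.

```python
from fractions import Fraction
from itertools import product

# Hypothetical sample values lambda(ij), with lambda(ij) * lambda(ji) = 1.
LAM = {(1, 1): Fraction(-1), (2, 2): Fraction(1),
       (1, 2): Fraction(2, 3), (2, 1): Fraction(3, 2)}

def word(*letters):
    """The monomial x_{i_1} ... x_{i_n} as an algebra element."""
    return {tuple(letters): Fraction(1)}

def chi(u, v):
    """Bi-character evaluated on the degrees of the words u and v."""
    c = Fraction(1)
    for a in u:
        for b in v:
            c *= LAM[(a, b)]
    return c

def bracket(f, g):
    """[f, g], extended bilinearly from [u, v] = uv - chi(u, v) vu on words."""
    out = {}
    for u, cu in f.items():
        for v, cv in g.items():
            out[u + v] = out.get(u + v, 0) + cu * cv
            out[v + u] = out.get(v + u, 0) - cu * cv * chi(u, v)
    return {w: c for w, c in out.items() if c != 0}

def lincomb(*terms):
    """Linear combination of (coefficient, element) pairs."""
    out = {}
    for coeff, elem in terms:
        for w, c in elem.items():
            out[w] = out.get(w, 0) + coeff * c
    return {w: c for w, c in out.items() if c != 0}

def skew_defect(u, v):
    """[f, g] + chi(rho, sigma)[g, f] for the words f = u, g = v."""
    f, g = word(*u), word(*v)
    return lincomb((1, bracket(f, g)), (chi(u, v), bracket(g, f)))

def jacobi_defect(u, v, w):
    """[f,[g,h]] - chi(rho, sigma)[g,[f,h]] - [[f,g],h] for words f, g, h."""
    f, g, h = word(*u), word(*v), word(*w)
    return lincomb((1, bracket(f, bracket(g, h))),
                   (-chi(u, v), bracket(g, bracket(f, h))),
                   (-1, bracket(bracket(f, g), h)))

words = [(1,), (2,), (1, 1), (1, 2), (2, 1), (2, 2)]
print(all(skew_defect(u, v) == {} for u, v in product(words, repeat=2)))
print(all(jacobi_defect(u, v, w) == {} for u, v, w in product(words, repeat=3)))
```

Both defects vanish identically because `chi` is multiplicative in each argument and skew-symmetric; exact rational arithmetic avoids rounding issues.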

Let $L$ and $L'$ be two algebras graded by the same $M$. A linear map $\phi: L \to L'$
will be said to respect grade if $f \in L_\rho$ implies $\phi(f) \in L'_\rho$. Let $L$ be a $\chi$-algebra
and $A$ an associative algebra both graded by $M$. A grade-respecting linear
map $\phi: L \to A$ will be called a linearization of $L$ in $A$ if $\phi$ is a homomorphism
of $L$ into $[A]$, that is, if

$$\phi([f, g]) = \phi(f)\phi(g) - \chi(\rho, \sigma)\phi(g)\phi(f)$$

for all $f \in L_\rho$, $g \in L_\sigma$. The tensor algebra $T$ over the vector space $L$ is graded
by $M$ if $T_\rho$ is defined to be the subspace spanned by elements of the form
$f_1 \otimes f_2 \otimes \ldots \otimes f_k$, where $f_i \in L_{\rho_i}$ and $\rho_1 + \rho_2 + \ldots + \rho_k = \rho$. Let $J$ be the
two-sided ideal of $T$ generated by homogeneous elements of the form
$f \otimes g - \chi(\rho, \sigma) g \otimes f - [f, g]$, where $f \in L_\rho$, $g \in L_\sigma$. Then the algebra $U = T/J$
is also graded by $M$, and the inclusion map $L \to T$ induces a linearization $\eta$
of $L$ in $U$. The algebra $U$ will be called the universal enveloping algebra of $L$;
it can be characterized by the property: for any linearization $\phi: L \to A$ of $L$
into an associative algebra $A$, there exists a grade-respecting homomorphism

$$\xi: U \to A \text{ such that } \phi = \xi \circ \eta.$$


2. Finitely generated free $\chi$-algebras. From now on we shall consider
$\chi$-algebras $L$ satisfying the following conditions (2.1)-(2.4):

(2.1) $M$ is a free abelian group of rank $m$, with basis elements $\rho_1, \rho_2, \ldots, \rho_m$;

(2.2) $L_\rho = 0$ unless $\rho$ is of the form $\rho = t_1\rho_1 + t_2\rho_2 + \ldots + t_m\rho_m$, where
$t_1, t_2, \ldots, t_m$ are non-negative integers not all of which are zero;

(2.3) each $L_{\rho_i}$ $(i = 1, 2, \ldots, m)$ is of dimension 1;

(2.4) $L$ is generated by $L_{\rho_1}, L_{\rho_2}, \ldots, L_{\rho_m}$.

A $\chi$-algebra $L$ satisfying (2.1)-(2.4) above will be called a free $\chi$-algebra
of rank $m$ if any $\chi$-algebra satisfying (2.1)-(2.4) is a (grade-respecting) homomorphic
image of $L$. The existence of a free $\chi$-algebra can be seen as follows:
let $F$ be the free (non-associative) algebra generated by an $m$-dimensional
vector space $E$ over the field $K$. If we choose a basis of $E$ over $K$, then $F$ can
be graded in an obvious way by the free abelian group $M$ of rank $m$. Let $J$
be the two-sided ideal of $F$ generated by homogeneous elements of the forms
$fg + \chi(\rho, \sigma)gf$ and $f(gh) - \chi(\rho, \sigma)g(fh) - (fg)h$, where $f \in F_\rho$, $g \in F_\sigma$. Then
$L = F/J$ is easily seen to be a free $\chi$-algebra of rank $m$.
















Let $U$ be the universal enveloping algebra of the free $\chi$-algebra $L$ of rank
$m$ with the linearization map $\eta: L \to U$, and let $A$ be the free associative
algebra over $K$ generated by $m$ free generators $x_1, x_2, \ldots, x_m$. Since $L$ is
free, there exists a homomorphism $\phi: L \to [A]$ such that $\phi(f_i) = x_i$, $i = 1,
2, \ldots, m$ (where $f_i$ denotes a basis element of $L_{\rho_i}$), that is, $\phi$ is a linearization of $L$ in $A$. Then by the definition of $U$,
there exists a grade-respecting homomorphism $\xi: U \to A$ such that $\phi = \xi \circ \eta$.
Then $\xi$ must be an isomorphism, since $A$ is free-associative. Thus we may regard
$U$ as a free associative algebra with free generators $x_1 = \eta(f_1), \ldots, x_m = \eta(f_m)$.
The fact that $\eta(f) = 0$ implies $f = 0$ can also be proved in exactly the same
way as in the case of free Lie algebras (3, 1-9). Hence we may identify $L$ as
the subalgebra of $[U]$ generated by $x_1, \ldots, x_m$. It can be seen easily that $L$
is spanned by the elements

$$[x_{i_1} x_{i_2} \ldots x_{i_n}] = [x_{i_1}, [x_{i_2}, \ldots [x_{i_{n-1}}, x_{i_n}] \ldots ]]$$

defined in the Introduction by using $\lambda(ij) = \chi(\rho_i, \rho_j)$. Thus we may state

THEOREM 2.5. Let $K$ be a field of characteristic zero, $x_1, x_2, \ldots, x_m$ non-commutative
associative indeterminates over $K$, and $\lambda(ij)$, $i, j = 1, 2, \ldots, m$, be $m^2$
elements in $K$ such that $\lambda(ij)\lambda(ji) = 1$ for all $i$ and $j$. Then the vector space
over $K$ spanned by the elements

$$[x_{i_1} x_{i_2} \ldots x_{i_n}]$$

defined above forms a free $\chi$-algebra $L$ with respect to the multiplication

$$[[x_{i_1} \ldots x_{i_r}], [x_{j_1} \ldots x_{j_s}]] = [x_{i_1} \ldots x_{i_r}][x_{j_1} \ldots x_{j_s}] - \prod_{\mu=1}^{r} \prod_{\nu=1}^{s} \lambda(i_\mu j_\nu) \, [x_{j_1} \ldots x_{j_s}][x_{i_1} \ldots x_{i_r}].$$

The universal enveloping algebra of $L$ is isomorphic to the free associative algebra
with $m$ free generators.


It should be understood in the above theorem that $L$ is graded by $M$ as
follows: for $\rho = t_1\rho_1 + t_2\rho_2 + \ldots + t_m\rho_m$, $L_\rho$ consists of linear combinations
of elements of the form

$$[x_{i_1} x_{i_2} \ldots x_{i_n}]$$

in which, for each $i$, $x_i$ appears $t_i$ times. Also, $\chi$ is defined by $\chi(\rho_i, \rho_j) = \lambda(ij)$.


3. A generalization of a Witt formula. Let $L$ be as in Theorem 2.5.
An element in $L$ will be called a homogeneous element of degree $n$ if it is a linear
combination of elements of the form

$$[x_{i_1} x_{i_2} \ldots x_{i_n}].$$

In this section we shall compute the dimension of the space spanned by all
homogeneous elements of degree $n$, following a method given by Witt (4). By
the same method one may be able to compute the dimension of each $L_\rho$.










Let $A$ and $B$ be two associative algebras both graded by $M$, and $A \otimes B$
the tensor product of $A$ and $B$ regarded as vector spaces over $K$. Using a
bi-character $\chi$ of $M$, define a multiplication in the vector space $A \otimes B$ by

$$(a \otimes b)(a' \otimes b') = \chi(\sigma, \rho') \, (aa' \otimes bb')$$

where $b \in B_\sigma$, $a' \in A_{\rho'}$. The algebra obtained in this way is easily seen to be
associative, and will be denoted simply by $A \otimes B$. It will be used in the
proof of (3.1), below, as well as in the formulation of a generalization of a
theorem of Friedrichs.

Now, for the skew-symmetric bi-character $\chi$ of $M$, we have $\chi(\rho, \rho) = \pm 1$
for any $\rho \in M$. The subspace $L_\rho$ of the free $\chi$-algebra $L$ will be called positive
or negative according as $\chi(\rho, \rho) = 1$ or $\chi(\rho, \rho) = -1$. Choose a basis for
each positive $L_\rho$ and let the union of these basis elements be $P_1, P_2, P_3, \ldots$.
Also, choose a basis for each negative $L_\rho$ and let the union of these basis
elements be $Q_1, Q_2, Q_3, \ldots$. Let $\eta: L \to U$ be the linearization of $L$ into its
universal enveloping algebra $U$. Then we have


THEOREM 3.1. The elements

$$\eta(P_1)^{s_1} \eta(P_2)^{s_2} \ldots \eta(P_k)^{s_k} \, \eta(Q_1)^{t_1} \eta(Q_2)^{t_2} \ldots \eta(Q_n)^{t_n}$$

form a basis of the universal enveloping algebra $U$ of the free $\chi$-algebra $L$. Here
the indices run as follows: $s_1, s_2, \ldots$ are non-negative integers; each of $t_i$ is
either 0 or 1; $k, n = 0, 1, 2, \ldots$.

Proof. Since, for each $i$, with $Q_i \in L_\rho$,

$$\eta([Q_i, Q_i]) = \eta(Q_i)^2 - \chi(\rho, \rho)\eta(Q_i)^2 = 2\eta(Q_i)^2,$$

it follows that $\eta(Q_i)^2$ is a linear combination of some $\eta(P_j)$'s and some $\eta(Q_j)$'s.
Then by the definition of the linearization, it is clear that $U$ is spanned by the
given elements. Thus it remains to show that the given elements are linearly
independent. For this purpose, let $U'$ be a replica of $U$ with grade-respecting
isomorphism $\iota: U \to U'$, and let $\eta' = \iota \circ \eta$. Let $U \otimes U'$ be the tensor product
of $U$ and $U'$ with respect to $\chi$. Then $U \otimes U'$ is also graded by $M$ in an obvious
way, and the map $\delta: L \to U \otimes U'$ defined by

$$\delta(f) = \eta(f) \otimes 1 + 1 \otimes \eta'(f)$$

is easily seen to be a linearization of $L$ into $U \otimes U'$. Therefore there exists
a homomorphism $\xi: U \to U \otimes U'$ such that $\xi \circ \eta = \delta$. Using $\xi$, one may now
prove the linear independence of the given elements in exactly the same way
as in the case of ordinary Lie algebras (3, pp. 1-8). We omit the details.
Now, let the free $\chi$-algebra $L$ given in (2.5) be graded by $M$ as in the
remark following (2.5). Let the basis elements $\rho_1, \rho_2, \ldots, \rho_m$ be such that

$$\rho_1, \ldots, \rho_p$$

are positive while

$$\rho_{p+1}, \ldots, \rho_{p+q}$$

$(p + q = m)$ are negative. Since, for $\rho = t_1\rho_1 + \ldots + t_m\rho_m$,

$$\chi(\rho, \rho) = \prod_{i,j} \chi(\rho_i, \rho_j)^{t_i t_j} = \prod_i \chi(\rho_i, \rho_i)^{t_i^2} = (-1)^t,$$

where $t = t_{p+1} + \ldots + t_{p+q}$, it follows that

$$[x_{i_1} x_{i_2} \ldots x_{i_n}]$$

belongs to a positive $L_\rho$ if and only if its degree with respect to $x_{p+1}, \ldots, x_{p+q}$
is even. Denote by $p_n$ and $q_n$, respectively, the numbers of $P_i$'s of degree $n$
and the numbers of $Q_i$'s of degree $n$, and consider the formal power series


$$F(x, \lambda) = \prod_{d=1}^{\infty} (1 + x^d + x^{2d} + \ldots)^{p_d} (1 + \lambda x^d)^{q_d}$$

with a parameter $\lambda$. The coefficient $c_n(\lambda)$ of $x^n$ in $F(x, \lambda)$ is a polynomial in $\lambda$
with integral coefficients. By (3.1), $c_n(1)$ is equal to the dimension of the subspace
of $U$ spanned by all homogeneous elements of degree $n$; $c_n(1) = (p + q)^n$.
On the other hand, also by (3.1), $c_n(-1) = a_n - b_n$, where $a_n$ denotes the
dimension of the subspace $A_n$ of $U$ spanned by all homogeneous elements
which are of even degrees with respect to $x_{p+1}, \ldots, x_{p+q}$, and where $b_n$ denotes
the dimension of the subspace $B_n$ of $U$ spanned by all homogeneous elements
which are of odd degrees with respect to $x_{p+1}, \ldots, x_{p+q}$. Since $U$ is free associative,
$A_n$ (resp. $B_n$) is spanned by elements

$$x_{i_1} x_{i_2} \ldots x_{i_n}$$

of even (resp. odd) degree with respect to $x_{p+1}, \ldots, x_{p+q}$. Thus

$$a_n = C_{n,0}\, p^n + C_{n,2}\, p^{n-2} q^2 + \ldots,$$
$$b_n = C_{n,1}\, p^{n-1} q + C_{n,3}\, p^{n-3} q^3 + \ldots,$$

where $C_{n,i}$ are binomial coefficients. Hence $a_n - b_n = (p - q)^n$, and we have

$$F(x, 1) = 1 + (p + q)x + (p + q)^2 x^2 + \ldots,$$
$$F(x, -1) = 1 + (p - q)x + (p - q)^2 x^2 + \ldots.$$

Taking logarithms of both sides, and comparing the coefficients of $x^n/n$, we
have, for $n = 1, 2, \ldots$,

$$\sum_{d \mid n} d\,p_d - \sum_{d \mid n} (-1)^{n/d} d\,q_d = (p + q)^n,$$
$$\sum_{d \mid n} d\,p_d - \sum_{d \mid n} d\,q_d = (p - q)^n.$$

Let $k > 0$ be an odd integer and let $a \ge 1$. Then, since

$$\sum_{d \mid 2^a k} (-1)^{2^a k/d} d\,q_d = \sum_{d \mid 2^{a-1} k} d\,q_d - \sum_{d \mid k} 2^a d\,q_{2^a d},$$
$$\sum_{d \mid 2^a k} d\,p_d = \sum_{d \mid 2^{a-1} k} d\,p_d + \sum_{d \mid k} 2^a d\,p_{2^a d},$$

we obtain, from the above,

$$\sum_{d \mid k} 2^a d\,(p_{2^a d} + q_{2^a d}) = (p + q)^{2^a k} - (p - q)^{2^{a-1} k}.$$

Then by the Möbius inversion formula, we have

$$p_{2^a k} + q_{2^a k} = \frac{1}{2^a k} \sum_{d \mid k} \mu(d) \big( (p + q)^{2^a k/d} - (p - q)^{2^{a-1} k/d} \big).$$

In case $a = 0$, the first of the two relations above gives directly (for odd $k$)

$$p_k + q_k = \frac{1}{k} \sum_{d \mid k} \mu(d) (p + q)^{k/d}.$$

Following Witt, we shall use the notations:

$$\psi(n) = \frac{1}{n} \sum_{d \mid n} \mu(d) (p + q)^{n/d}; \qquad \psi^*(n) = p_n + q_n.$$

Then the above can be summarized as


THEOREM 3.2. The dimension $\psi^*(n)$ of the vector space spanned by all elements
of the form

$$[x_{i_1} x_{i_2} \ldots x_{i_n}]$$

is given, for odd $k$ and $a \ge 1$, by

$$\psi^*(k) = \psi(k);$$
$$\psi^*(2^a k) = \psi(2^a k) + \frac{1}{2^a k} \sum_{d \mid k} \mu(d) \big( (p + q)^{2^{a-1} k/d} - (p - q)^{2^{a-1} k/d} \big),$$

where $p$ denotes the number of indices $i$ such that $\lambda(ii) = \chi(\rho_i, \rho_i) = 1$ while
$q$ denotes the number of indices $j$ such that $\lambda(jj) = -1$.


It should be remarked that the function $\psi^*(n)$ is completely determined
by the values of $\lambda(ii)$, and independent of other values of $\lambda(ij)$. The Witt
formula is obtained as the case $q = 0$. In case all $\lambda(ii) = -1$, we have
$p = 0$, and we may deduce from the above that

$$\psi^*(n) = \begin{cases} \psi(n) & \text{for } n \equiv 0, 1, 3 \pmod 4, \\ \psi(n) + \psi(\tfrac{1}{2}n) & \text{for } n \equiv 2 \pmod 4. \end{cases}$$
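The dimension formula of Theorem 3.2 is easy to evaluate. The sketch below is an illustration under stated assumptions, not part of the paper: `psi` and `psi_star` implement the displayed formulas, while `counts_by_recursion` (a hypothetical helper name) solves the two divisor-sum relations for $p_n$ and $q_n$ directly, which gives an independent cross-check.

```python
def divisors(n):
    return [d for d in range(1, n + 1) if n % d == 0]

def mobius(n):
    """Moebius function mu(n) by trial division."""
    result, d = 1, 2
    while d * d <= n:
        if n % d == 0:
            n //= d
            if n % d == 0:
                return 0
            result = -result
        d += 1
    return -result if n > 1 else result

def psi(n, m):
    """Witt's function for m generators: (1/n) * sum_{d|n} mu(d) m^(n/d)."""
    return sum(mobius(d) * m ** (n // d) for d in divisors(n)) // n

def psi_star(n, p, q):
    """Dimension from Theorem 3.2, with p positive and q negative generators."""
    a, k = 0, n
    while k % 2 == 0:
        k //= 2
        a += 1
    numerator = sum(mobius(d) * (p + q) ** (n // d) for d in divisors(n))
    if a >= 1:
        half = 2 ** (a - 1) * k
        numerator += sum(mobius(d) * ((p + q) ** (half // d) - (p - q) ** (half // d))
                         for d in divisors(k))
    return numerator // n

def counts_by_recursion(n_max, p, q):
    """Solve the two divisor-sum relations for p_n and q_n, n = 1..n_max."""
    P, Q = {}, {}
    for n in range(1, n_max + 1):
        proper = [d for d in divisors(n) if d < n]
        s1 = sum(d * P[d] - (-1) ** (n // d) * d * Q[d] for d in proper)
        s2 = sum(d * P[d] - d * Q[d] for d in proper)
        plus = (p + q) ** n - s1     # equals n * (p_n + q_n)
        minus = (p - q) ** n - s2    # equals n * (p_n - q_n)
        P[n] = (plus + minus) // (2 * n)
        Q[n] = (plus - minus) // (2 * n)
    return P, Q

print([psi_star(n, 2, 0) for n in range(1, 7)])   # ordinary Witt numbers for two generators
print([psi_star(n, 0, 1) for n in range(1, 7)])   # one generator x with lambda(11) = -1
```

For a single generator with $\lambda(11) = -1$ the only non-zero generalized Lie elements are multiples of $x$ and of $[x, x] = 2x^2$, which agrees with the second printed list having non-zero entries only in degrees 1 and 2.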

4. An algebra associated with shuffles. We shall generalize the algebra
defined in (2) to apply to generalized Lie elements. If $r$ and $s$ are positive
integers, define a shuffle of type $(r, s)$ to be a permutation $\sigma$ of the numbers
$1, 2, \ldots, r + s$ such that $1 \le \sigma(\mu) < \sigma(\nu) \le r$ or $r < \sigma(\mu) < \sigma(\nu) \le r + s$
implies $\mu < \nu$. Take $m^2$ elements $\lambda(ij)$ in $K$ arbitrarily, and define an algebra
$A$ over $K$ as follows. $A$ has the basis

$$1, \quad a(i_1 i_2 \ldots i_n) \qquad (n = 1, 2, \ldots;\ i_1, \ldots, i_n = 1, 2, \ldots, m)$$

with the multiplication table: 1 is a unity element;

$$a(i_1 \ldots i_r)\, a(i_{r+1} \ldots i_{r+s}) = \sum_\sigma \lambda(\sigma)\, a(i_{\sigma(1)} i_{\sigma(2)} \ldots i_{\sigma(r+s)}),$$

where the sum ranges over all shuffles $\sigma$ of type $(r, s)$ while $\lambda(\sigma)$ denotes
the product of all $\lambda(i_{\sigma(\mu)} i_{\sigma(\nu)})$ such that $\mu < \nu$ and $\sigma(\mu) > \sigma(\nu)$. (We set
$\lambda(\sigma) = 1$ if $\sigma$ is the identity permutation.)

Thus, for example,

$$a(i)a(j) = a(ij) + \lambda(ji)a(ji);$$
$$a(i)a(jk) = a(ijk) + \lambda(ji)a(jik) + \lambda(ji)\lambda(ki)a(jki).$$


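THEOREM 4.1. The algebra $A$ is associative, and if $\lambda(ij)\lambda(ji) = 1$ for all $i$
and $j$, then it satisfies the generalized commutativity:

$$a(j_1 \ldots j_s)\, a(i_1 \ldots i_r) = \prod_{\mu=1}^{r} \prod_{\nu=1}^{s} \lambda(i_\mu j_\nu)\; a(i_1 \ldots i_r)\, a(j_1 \ldots j_s).$$


Proof. If

$$f = a(i_1 \ldots i_r), \quad g = a(i_{r+1} \ldots i_{r+s}), \quad h = a(i_{r+s+1} \ldots i_{r+s+t}),$$

then it is readily seen that both $(fg)h$ and $f(gh)$ are of the form

$$\sum_\sigma \lambda(\sigma)\, a(i_{\sigma(1)} i_{\sigma(2)} \ldots i_{\sigma(r+s+t)}),$$

where $\sigma$ runs over all permutations of $1, 2, \ldots, r + s + t$ such that any one
of the three conditions

$$1 \le \sigma(\mu) < \sigma(\nu) \le r,$$
$$r < \sigma(\mu) < \sigma(\nu) \le r + s,$$
$$r + s < \sigma(\mu) < \sigma(\nu) \le r + s + t$$

implies $\mu < \nu$, and where $\lambda(\sigma)$ denotes the product of all $\lambda(i_{\sigma(\mu)} i_{\sigma(\nu)})$ such
that $\mu < \nu$ and $\sigma(\mu) > \sigma(\nu)$. Hence $(fg)h = f(gh)$. The second half of the
theorem may be verified easily.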

In the rest of this section, we shall assume that $\lambda(ij)\lambda(ji) = 1$ for all $i$
and $j$. Making the convention that $a(i_1 \ldots i_r)$ stands for 1 whenever the
set $\{i_1, \ldots, i_r\}$ of indices is empty, we define the bilinear operation $\vee$ in $A$ by

$$a(i_1 \ldots i_r) \vee a(j_1 \ldots j_s) = a(i_1 \ldots i_r j_1 \ldots j_s).$$

We also make the convention that the multiplication in $A$ has priority over
the operation $\vee$.

Define the elements $a[i_1 i_2 \ldots i_n]$ in $A$ inductively by $a[i] = a(i)$ and

$$a[i_1 i_2 \ldots i_n] = a(i_1) \vee a[i_2 \ldots i_n] - \prod_{\nu=1}^{n-1} \lambda(i_n i_\nu)\; a(i_n) \vee a[i_1 \ldots i_{n-1}].$$

For the generalizations in the next section of some theorems on Lie elements,
we need the following


THEOREM 4.2. For $n > 0$, we have

$$\sum_{s=1}^{n} a[i_1 \ldots i_s]\, a(i_{s+1} \ldots i_n) = n\, a(i_1 \ldots i_n).$$













The above theorem may be proved in exactly the same way as in the
case where all $\lambda(ij) = 1$ (2), if we use the linear map $D: A \to A$ defined by
$D(1) = 0$ and

$$D\, a(i_1 i_2 \ldots i_n) = \gamma(i_1)\, a(i_2 \ldots i_n),$$

where $\gamma(1), \ldots, \gamma(m)$ are $m$ arbitrary elements in $K$. We omit the proof
of (4.2). Incidentally, the map $D$ becomes an anti-derivation of $A$ if all
$\lambda(ij) = -1$.
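The weighted shuffle product and the elements $a[i_1 \ldots i_n]$ lend themselves to a direct computation. The following sketch is illustrative only: the function `lam` supplies hypothetical sample values with $\lambda(ij)\lambda(ji) = 1$, basis elements $a(i_1 \ldots i_n)$ are represented by tuples (the empty tuple standing for 1), and the helper names are assumptions. It reproduces the two worked products above and tests the identity of Theorem 4.2 on a few words.

```python
from fractions import Fraction

def lam(i, j):
    """Hypothetical sample values with lam(i, j) * lam(j, i) = 1."""
    if i == j:
        return Fraction(-1) if i % 2 else Fraction(1)
    if i < j:
        return Fraction(i + 1, j + 2)
    return 1 / lam(j, i)

def add_into(out, word, coeff):
    c = out.get(word, 0) + coeff
    if c != 0:
        out[word] = c
    else:
        out.pop(word, None)

def a(*letters):
    """Basis element a(i_1 ... i_n); a() is the unity element."""
    return {tuple(letters): Fraction(1)}

def shuffle_words(u, v):
    """a(u) a(v): each letter of v placed before a letter x of u costs lam(v_letter, x)."""
    if not u:
        return {v: Fraction(1)}
    if not v:
        return {u: Fraction(1)}
    out = {}
    for w, c in shuffle_words(u[1:], v).items():
        add_into(out, (u[0],) + w, c)
    weight = Fraction(1)
    for x in u:                      # v[0] jumps ahead of every remaining letter of u
        weight *= lam(v[0], x)
    for w, c in shuffle_words(u, v[1:]).items():
        add_into(out, (v[0],) + w, weight * c)
    return out

def mul(f, g):
    out = {}
    for u, cu in f.items():
        for v, cv in g.items():
            for w, c in shuffle_words(u, v).items():
                add_into(out, w, cu * cv * c)
    return out

def a_bracket(w):
    """a[i_1 ... i_n] by the inductive definition."""
    if len(w) == 1:
        return {w: Fraction(1)}
    out = {}
    for word, c in a_bracket(w[1:]).items():
        add_into(out, (w[0],) + word, c)
    coeff = Fraction(1)
    for x in w[:-1]:
        coeff *= lam(w[-1], x)
    for word, c in a_bracket(w[:-1]).items():
        add_into(out, (w[-1],) + word, -coeff * c)
    return out

def theorem_4_2_holds(w):
    """Check sum_{s=1}^{n} a[i_1..i_s] a(i_{s+1}..i_n) == n a(i_1..i_n)."""
    total = {}
    for s in range(1, len(w) + 1):
        for word, c in mul(a_bracket(w[:s]), {w[s:]: Fraction(1)}).items():
            add_into(total, word, c)
    return total == {w: len(w)}

print(mul(a(1), a(2, 3)))
print([theorem_4_2_holds(w) for w in [(1, 2), (1, 1), (1, 2, 3), (2, 1, 2), (1, 2, 3, 4)]])
```

The recursive form of `shuffle_words` is a design choice: taking the next letter from the second factor multiplies in exactly the factors $\lambda(i_{\sigma(\mu)} i_{\sigma(\nu)})$ for the inversions that this letter creates, so no explicit enumeration of permutations is needed.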

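THEOREM 4.3. If the linear map $\phi: A \to A$ is defined by $\phi(1) = 0$ and
$\phi(a(i_1 i_2 \ldots i_n)) = a[i_1 i_2 \ldots i_n]$, then $\phi(a(i_1 \ldots i_r)\, a(i_{r+1} \ldots i_{r+s})) = 0$ for all
$r \ge 1$, $s \ge 1$ and all $i_1, \ldots, i_{r+s}$.


Proof. We shall proceed by induction on $n = r + s$. If $n = 2$, then the
theorem can be verified easily. Assume $n > 2$ and that the theorem is proved
for smaller values of $n$. By the definition of the multiplication in $A$, we have

$$\phi(a(i_1 \ldots i_r)\, a(i_{r+1} \ldots i_n)) = \sum_\sigma \lambda(\sigma)\, a(i_{\sigma(1)}) \vee a[i_{\sigma(2)} \ldots i_{\sigma(n)}] - \sum_\sigma \lambda(\sigma) \prod_{\nu=1}^{n-1} \lambda(i_{\sigma(n)} i_{\sigma(\nu)})\; a(i_{\sigma(n)}) \vee a[i_{\sigma(1)} \ldots i_{\sigma(n-1)}],$$

where the sums run over all shuffles of type $(r, s)$, $r + s = n$. Since $\sigma(1) = 1$
or $r + 1$, and $\sigma(n) = r$ or $n$, the right-hand side of the above equation can
be written

$$\sum_{\sigma(1)=1} \lambda(\sigma)\, a(i_1) \vee a[i_{\sigma(2)} \ldots i_{\sigma(n)}] + \sum_{\sigma(1)=r+1} \lambda(\sigma)\, a(i_{r+1}) \vee a[i_{\sigma(2)} \ldots i_{\sigma(n)}]$$

$$- \sum_{\sigma(n)=r} \lambda(\sigma) \prod_{\nu=1}^{n-1} \lambda(i_r i_{\sigma(\nu)})\; a(i_r) \vee a[i_{\sigma(1)} \ldots i_{\sigma(n-1)}] - \sum_{\sigma(n)=n} \lambda(\sigma) \prod_{\nu=1}^{n-1} \lambda(i_n i_{\sigma(\nu)})\; a(i_n) \vee a[i_{\sigma(1)} \ldots i_{\sigma(n-1)}]$$

$$= a(i_1) \vee \phi(a(i_2 \ldots i_r)\, a(i_{r+1} \ldots i_n)) + \prod_{\nu=1}^{r} \lambda(i_{r+1} i_\nu)\; a(i_{r+1}) \vee \phi(a(i_1 \ldots i_r)\, a(i_{r+2} \ldots i_n))$$

$$- \prod_{\mu=r+1}^{n} \lambda(i_\mu i_r) \prod_{\nu \ne r} \lambda(i_r i_\nu)\; a(i_r) \vee \phi(a(i_1 \ldots i_{r-1})\, a(i_{r+1} \ldots i_n)) - \prod_{\nu=1}^{n-1} \lambda(i_n i_\nu)\; a(i_n) \vee \phi(a(i_1 \ldots i_r)\, a(i_{r+1} \ldots i_{n-1}))$$

$$= 0,$$

because of the induction assumption and the fact that, for $r = 1$,

$$\prod_{\mu=r+1}^{n} \lambda(i_\mu i_r) \prod_{\nu=r+1}^{n} \lambda(i_r i_\nu) = 1.$$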


poert+l 


COROLLARY 4.3. If $0 < r < n$, then

$$n\, a(i_1 \ldots i_r)\, a(i_{r+1} \ldots i_n) = \sum_\sigma \lambda(\sigma) \big( n\, a(i_{\sigma(1)} \ldots i_{\sigma(n)}) - a[i_{\sigma(1)} \ldots i_{\sigma(n)}] \big),$$

where the sum ranges over all shuffles of type $(r, n - r)$.


The above corollary, together with (4.2), shows that the $(n - 1)m^n$ elements
$a(i_1 \ldots i_r)\, a(i_{r+1} \ldots i_n)$, $i_1, \ldots, i_n = 1, 2, \ldots, m$; $0 < r < n$, and the
$m^n$ elements $n\, a(i_1 \ldots i_n) - a[i_1 \ldots i_n]$ span the same vector space over $K$.


Also from (4.2) we obtain

COROLLARY 4.4. The linear map $\phi_0: A \to A$ defined by $\phi_0(1) = 0$ and

$$\phi_0(a(i_1 i_2 \ldots i_n)) = n^{-1} a[i_1 i_2 \ldots i_n],$$

for $n > 0$, is a projection, that is, $\phi_0^2 = \phi_0$.


The following theorem is essentially a generalization of Theorem 2.6 of
(2), and may be proved by using the map $D$ introduced in the above.


THEOREM 4.5. For $n > 0$, we have

$$\sum_{r=0}^{n} (-1)^r \prod_{r < \mu < \nu \le n} \lambda(i_\nu i_\mu)\; a(i_1 \ldots i_r)\, a(i_n i_{n-1} \ldots i_{r+1}) = 0.$$


5. Generalization of a theorem of Friedrichs. Let $L$ be a free $\chi$-algebra
of rank $m$, and $\eta: L \to U$ the linearization of $L$ into its universal
enveloping algebra. Let $U'$ be a replica of $U$ with the grade-respecting isomorphism
$\iota: U \to U'$ and $\eta' = \iota \circ \eta$. Let $U \otimes U'$ be the tensor product of
$U$ and $U'$ with respect to $\chi$. In the course of the proof of (3.1) we have seen
that the map $\delta: L \to U \otimes U'$ defined by

$$\delta(f) = \eta(f) \otimes 1 + 1 \otimes \eta'(f)$$

is a linearization and that there exists a homomorphism $\xi: U \to U \otimes U'$ such
that $\xi \circ \eta = \delta$. Now the following theorem generalizes a theorem of Friedrichs
(2).

THEOREM 5.1. Let $\eta$, $\iota$, and $\xi$ be as above. Then an element $u$ in $U$ belongs
to the image $\eta(L)$ of $L$ under $\eta$ if and only if

$$\xi(u) = u \otimes 1 + 1 \otimes \iota(u).$$


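Proof. The "only if" part follows from the fact that $\delta = \xi \circ \eta$. In order
to prove the "if" part, let $x_1, x_2, \ldots, x_m$ be free generators of $U$ and write,
for simplicity, $x_i$ and $x_i'$ for $x_i \otimes 1$ and $1 \otimes \iota(x_i)$, respectively. If

$$u = \sum \alpha_{i_1 \ldots i_n}\, x_{i_1} \ldots x_{i_n}$$

with coefficients in $K$, then

$$\xi(u) = \sum \alpha_{i_1 \ldots i_n} (x_{i_1} + x_{i_1}') \ldots (x_{i_n} + x_{i_n}')$$

$$= \sum \sum_{s=0}^{n} \theta(a(i_1 \ldots i_s)\, a(i_{s+1} \ldots i_n))\; x_{i_1} \ldots x_{i_s}\, x_{i_{s+1}}' \ldots x_{i_n}',$$

where $\theta$ is a linear map $A \to K$ defined by

$$\theta(a(i_1 \ldots i_n)) = \alpha_{i_1 \ldots i_n}.$$

Hence the condition given in (5.1) is equivalent to

$$\theta(a(i_1 \ldots i_s)\, a(i_{s+1} \ldots i_n)) = 0 \qquad (0 < s < n).$$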


The rest of the proof is exactly the same as in the case $\lambda(ij) = 1$ (2, p. 214),
and may be omitted. Here we have to use

$$\sum a[i_1 \ldots i_n]\, x_{i_1} \ldots x_{i_n} = \sum a(i_1 \ldots i_n)\, [x_{i_1} \ldots x_{i_n}],$$

but this, too, can be proved as in (2, p. 213).

Similarly we may prove the following


THEOREM 5.2. A homogeneous element

$$f = \sum \alpha_{i_1 \ldots i_n}\, x_{i_1} \ldots x_{i_n}$$

in $U$ of degree $n > 0$ is a generalized Lie element if and only if

$$n f = \sum \alpha_{i_1 \ldots i_n}\, [x_{i_1} \ldots x_{i_n}].$$


This generalizes a theorem of Dynkin-Specht-Wever (2, p. 214).


REFERENCES 


1. C. Chevalley, Fundamental concepts of algebra (New York: Academic Press Inc., 1956).

2. Rimhak Ree, Lie elements and an algebra associated with shuffles, Ann. Math., 68 (1958),
210-220.

3. Séminaire "Sophus Lie," Théorie des algèbres de Lie et topologie des groupes de Lie, 1954-5.

4. Ernst Witt, Treue Darstellung Liescher Ringe, J. Reine Angew. Math., 177 (1937), 152-160.


University of British Columbia 

















MODIFICATIONS AND COBOUNDING MANIFOLDS 
ANDREW H. WALLACE 


Introduction. The object of this paper is to establish a simple connection 
between Thom’s theory of cobounding manifolds and the theory of modifi- 
cations. The former theory is given in detail in (8) and sketched in (3), while 
the latter is worked out in (1). In particular in (1) it is shown that the only 
modifications which can transform one differentiable manifold into another 
are what I call below spherical modifications, which consist in taking out a 
sphere from the given manifold and replacing it by another. The main result 
is that manifolds cobound if and only if each is obtainable from the other 
by a finite sequence of spherical modifications. 

The technique consists in approximating the manifolds by pieces of algebraic
varieties. Thus if $M_1$ and $M_2$ form the boundary of $M$, the last is taken to
be part of an algebraic variety such that $M_1$ and $M_2$ are two members of a
pencil of hyperplane sections. If this pencil is properly chosen it will cut only
finitely many singular sections on $M$, each of which will correspond to a
spherical modification. The converse result is proved by a construction which
seeks to bring about the situation just sketched. These results are proved in
the first three sections.

The situation described here is essentially the same as arises in the study
of critical values of a function on a manifold. Thus if $M$ is embedded in
$N$-space, each modification on the way from $M_1$ to $M_2$ corresponds to a
critical value of $x_N$. The main result of § 4 is to show that the embedding
can be done in such a way that, as $x_N$ increases from its value on $M_1$ to its
value on $M_2$, the type numbers of these critical points (7, p. 21) do not
decrease. Whether the theory of critical points could be used more extensively
in the present connection is not quite clear. One factor arising here (as for
example in § 5) is that $M_1$ and $M_2$ are the main objects of interest usually,
and the $M$ which they cobound may be altered in some way, whereas the
application of critical point theory would require that $M$ should not be
changed but should be treated as the underlying space. At any rate so far
any application of, say, the Morse inequalities (7, p. 85) has yielded only
trivial results.

Section 5 shows how the same effect may be brought about sometimes by 
modifications of different types, and the result is applied to give a solution 
of a problem of Bing (2) on the structure of 3-manifolds. 

In § 6 it is shown that any differentiable manifold of dimension not less 


Received January 19, 1959. 













than 3 cobounds a simply connected manifold, while in §7 a few results are 
given extending this to higher homology and homotopy groups. 


1. Spherical modifications. Throughout this discussion $E^n$ and $S^n$ will
denote an $n$-dimensional cell and an $n$-dimensional sphere, respectively, subscripts
being used where necessary to distinguish between different copies of
these sets.

Let $M$ be a differentiable manifold of dimension $n$, and let $S^m$ be an $m$-sphere
homeomorphically and differentiably embedded in $M$. It is known that a
sufficiently small neighbourhood $B$ of $S^m$ in $M$ can be fibred by $(n-m)$-dimensional
cells; $B$ is then the normal bundle of $S^m$ in $M$. If $B$ can be expressed
as the topological product $S^m \times E^{n-m}$, $S^m$ will be said to be directly embedded
in $M$. In this case the frontier of $B$, or what is the same thing, the frontier
of $M - B$, is of the form $S^m \times S^{n-m-1}$. The last product can, however, be
identified with the frontier of a product of the type $E^{m+1} \times S^{n-m-1}$. It follows
at once that the union of $M - B$ and $E^{m+1} \times S^{n-m-1}$, corresponding points on
the frontiers of these sets being identified, can be made into a differentiable
manifold $M'$. The transition from $M$ to $M'$ is a modification (1). Modifications
constructed in this particular way from directly embedded spheres will be
called spherical modifications. To draw attention to the dimensions involved,
the modification from $M$ to $M'$ described above will be called a modification
of type $(m, n-m-1)$; it can easily be seen that the inverse operation, going
from $M'$ to $M$, is a spherical modification of type $(n-m-1, m)$. It will also
sometimes be convenient to describe the modification from $M$ to $M'$ as a
modification which shrinks $S^m$ and introduces $S^{n-m-1}$.

It is clear from the above description that the manifold $M'$ contains a
directly embedded sphere $S^{n-m-1}$ and that $M - S^m$ and $M' - S^{n-m-1}$ are, in
a natural way, homeomorphic. This homeomorphism will be said to be induced
by the modification.

Still using the above notations, it is not hard to see that the result of a
spherical modification does not depend essentially on the way in which $B$ is
fibred by cells transversal to $S^m$. This follows from the fact that every such
fibring can be continuously deformed into a canonical fibring by cells made
up from geodesic arcs normal to $S^m$, with respect to some Riemannian metric
on $M$. Similarly, isotopic deformations of $S^m$ will not affect the modifications.
On the other hand the mode of expression of $B$ as a product $S^m \times E^{n-m}$,
equivalent to the choice of a system of cross-sections of $B$, may be an essential
factor in determining the result of the modification. Thus it is not in general
possible to speak of the modification shrinking $S^m$ unless reference is also
made to the way in which $B$ is written as a product.
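As an illustration of the definitions above (a standard example supplied here as a sketch, not taken from the text), consider the simplest case $m = 0$ with $M = S^n$, $n \ge 2$:

```latex
\begin{aligned}
M &= S^n, \qquad S^0 = \{a, b\} \subset M, \qquad B = S^0 \times E^n
      \ \text{(two disjoint $n$-cells)},\\
M - B &\cong S^{n-1} \times I, \qquad
      \text{frontier of } B = S^0 \times S^{n-1}
      = \text{frontier of } E^1 \times S^{n-1},\\
M' &= (M - B) \cup (E^1 \times S^{n-1})
      \ \text{is an $S^{n-1}$-bundle over } S^1 .
\end{aligned}
```

This is a modification of type $(0, n-1)$ which shrinks $S^0$ and introduces $S^{n-1}$. Depending on how $B$ is written as a product, that is, on how the two boundary spheres are identified with the ends of $E^1 \times S^{n-1}$, the result $M'$ is either $S^1 \times S^{n-1}$ or the non-orientable $S^{n-1}$-bundle over $S^1$, which shows concretely why the product structure on $B$ can matter.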


2. Cobounding manifolds. A differentiable manifold with boundary is a 
topological space M with a subspace M, such that (1) M, is a differentiable 
manifold; (2) each point of M — M, has a neighbourhood homeomorphic to 











Ht, a ae ome 





re 








MODIFICATIONS AND COBOUNDING MANIFOLDS 505 


an n-cell (m the same for each point); (3) each point of M, has a neighbour- 
hood in M homeomorphic to a solid n-dimensional hemisphere, the base of 
the hemisphere corresponding to the part of the neighbourhood on M;; and (4) 
the transition functions between one neighbourhood and another of the types 
just described are differentiable. When M and M, are related in this way, M, 
will be said to be the boundary of M, and M, will be said to be a bounding 
manifold. In all this there is no need for the manifolds to be connected. Two 
differentiable manifolds M, and Mz will be said to be cobounding if their 
union is a bounding manifold. 

In the case of orientable manifolds the idea of bounding can be made a
bit stronger. If $M_1$ is orientable it will be said to be an oriented bounding
manifold if it is the boundary of an oriented manifold whose orientation induces
a preassigned orientation of $M_1$. The set of all orientable manifolds is now
taken as the set of generators of an additive abelian group. Each connected
manifold is supposed to be given a preassigned orientation, and the minus
sign denotes change of this orientation. The manifolds $M_1$ and $M_2$ are now
said to be cobounding if $M_1 - M_2$ is an oriented bounding manifold.

From the algebraic point of view, the notion of cobounding introduced at 
the beginning of this section can be described as cobounding modulo 2. 

The first main result to be proved is the following connection between the 
ideas of cobounding and of spherical modifications. 


THEOREM 1. Let $M_1$ and $M_2$ be two given compact differentiable manifolds, the
question of orientation being for the moment ignored. Then $M_1$ and $M_2$ are
cobounding if and only if each can be obtained from the other by a finite sequence
of spherical modifications.


Proof. The “if” part of the theorem will be established if it is shown that
$M_1$ and $M_2$ cobound whenever one is obtained from the other by a single
spherical modification, since the relation of cobounding is transitive. This will
be proved now as part (a) of the proof, part (b) being the proof of the converse.

(a) Suppose then that $M_2$ is obtained from $M_1$ by a spherical modification
of type $(r, n - r - 1)$, $n$ being the dimension of the manifolds. Thus there
are spheres $S^r$ and $S^{n-r-1}$ contained respectively in $M_1$ and $M_2$, with normal
bundles $B_1 = S^r \times E^{n-r}$ and $B_2 = E^{r+1} \times S^{n-r-1}$ in these manifolds such
that $M_1 - B_1$ and $M_2 - B_2$ are homeomorphic. Assume now that $M_1 - B_1$
and $M_2 - B_2$ are identified with $(M_1 - B_1) \times \{0\}$ and $(M_1 - B_1) \times \{1\}$,
respectively, in the set $(M_1 - B_1) \times I$, where $I$ is the unit interval $0 \le t \le 1$.
Form the union $[(M_1 - B_1) \times I] \cup B_1 \cup B_2$, $B_1$ and $B_2$ being inserted where
they belong in $(M_1 - B_1) \times \{0\}$ and $(M_1 - B_1) \times \{1\}$ according to the
identification just made. The subset $B_1 \cup B_2 \cup (\mathrm{Fr}\,B_1 \times I)$ in the space so
constructed is an $n$-sphere and so can be identified with the boundary of an
$(n + 1)$-cell $E^{n+1}$. Adding $E^{n+1}$ to $[(M_1 - B_1) \times I] \cup B_1 \cup B_2$ with suitable
identifications on the boundaries, an $(n + 1)$-dimensional manifold $M$ is
obtained, and can easily be adjusted along the boundary of $E^{n+1}$ so as to be
differentiable. Moreover, it is clear that the boundary of $M$ is the union of
$M_1$ and $M_2$. Thus $M_1$ and $M_2$ are cobounding manifolds as was to be shown.
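
The assertion in part (a) that the union of the two normal bundles with the collar $\mathrm{Fr}\,B_1 \times I$ is an $n$-sphere can be checked against the standard decomposition of the boundary of a product of cells. The display below is a sketch of that check and is not part of the original argument; it uses only the product forms $S^r \times E^{n-r}$ and $E^{r+1} \times S^{n-r-1}$ of the two bundles and their common frontier $S^r \times S^{n-r-1}$.

```latex
\begin{align*}
\partial\bigl(E^{r+1} \times E^{n-r}\bigr)
  &= \bigl(S^{r} \times E^{n-r}\bigr) \cup \bigl(E^{r+1} \times S^{n-r-1}\bigr), \\
\bigl(S^{r} \times E^{n-r}\bigr) \cap \bigl(E^{r+1} \times S^{n-r-1}\bigr)
  &= S^{r} \times S^{n-r-1}.
\end{align*}
```

Since $E^{r+1} \times E^{n-r}$ is an $(n+1)$-cell, its boundary is an $n$-sphere, and inserting the collar $S^r \times S^{n-r-1} \times I$ between the two pieces does not change this up to homeomorphism.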

(b) The idea of the converse is as follows. Suppose $M$ is a differentiable
manifold with boundary, the boundary being the union of $M_1$ and $M_2$. It is
to be shown that $M_2$ can be obtained from $M_1$ by a finite sequence of spherical
modifications. To show this, $M$ is first to be approximated by part of a real
algebraic variety in $N$-space in such a way that $M_1$ and $M_2$ are parts of the
sections by the hyperplanes $x_N = 0$ and $x_N = 1$, respectively. This can be
done in such a way that the family of hyperplanes $x_N = c$, for $0 \le c \le 1$,
cuts the approximation of $M$ in non-singular sections with just a finite number
of exceptions, on each of which there is exactly one singular point at which
the tangent cone is a non-degenerate quadric cone. Then it will be shown
that the transition from one side of a singular section to the other is locally
the same as the transition from negative to positive values of $t$ in a family
of quadrics

$$\sum a_i x_i^2 = t$$

in $n$-space, and hence it will be verified that each such transition is carried
out by means of a spherical modification.

The details of the proof just sketched will now be worked out. In the first
place $M$ is to be embedded in a Euclidean $N$-space $E_N$, which can be done
if $N$ is large enough. Also it is clear that the embedding can be done in such
a way that $M_1$ and $M_2$ lie in the hyperplanes $x_N = 0$ and $x_N = 1$, respectively,
while the rest of $M$ lies entirely between these hyperplanes. The algebraic
approximation mentioned above could be made already at this stage, but to
ensure that the approximating variety will have no points near $M$ except
those which are actually approximating points of $M$ it is convenient to carry
out the following additional construction. Take second copies in $E_N$ of $M$,
$M_1$, $M_2$, namely $M'$, $M_1'$, $M_2'$, respectively, and suppose that $M_1'$ and $M_2'$
lie in the hyperplanes $x_N = 0$ and $x_N = 1$, respectively, and that the rest of
$M'$ lies between these hyperplanes; also assume that $M \cap M' = \emptyset$. $M'$ can
be constructed in this way by a translation in $E_N$, for example. In addition
$M$ and $M'$ can be adjusted so that they cut the hyperplanes $x_N = 0$ and
$x_N = 1$ orthogonally. By adding to $M \cup M'$ sets homeomorphic to $M_1 \times I$
and $M_2 \times I$, lying in the parts of $E_N$ where $x_N \le 0$ and $x_N \ge 1$, respectively,
a compact differentiable manifold $M''$ can be constructed. $M''$ has the property
that there is a neighbourhood $U$ of $M$ in $E_N$ such that $U \cap M''$ is homeomorphic
to $M$; in fact it is equal to $M$ with, so to speak, a narrow fringe
added along $M_1$ and $M_2$. Now it is known (4; 9) that there is a real algebraic
variety $V$ in $E_N$ with an isolated sheet approximating $M''$ arbitrarily closely.
This approximation is not only in the pointwise sense, but also the tangent
linear varieties at corresponding points of $M''$ and $V$ approximate one another
arbitrarily closely (4; 9). In particular it follows that $M$ itself is approximated
arbitrarily closely by the part of $V \cap U$ which lies between $x_N = 0$ and
$x_N = 1$, while $M_1$ and $M_2$ are approximated by the intersections of $V \cap U$
with these hyperplanes.

At this stage it is convenient to make a change of notation, simply replacing
$M$ by its approximation. Thus from now on in this proof it will be assumed
that $M$ lies on a real algebraic variety $V$ in $E_N$ and that there is a neighbourhood
$U$ of $M$ such that $M$ is the part of $V \cap U$ lying between the hyperplanes
$x_N = 0$ and $x_N = 1$, while $M_1$ and $M_2$ are the intersections of these hyperplanes
with $M$.

Some properties of an algebraic variety in relation to a pencil of hyperplane
sections are now to be applied to the present situation. In the first place, if
$V$ is an algebraic variety in real projective space and $\Pi$ is a generic hyperplane
pencil, only a finite number of members of $\Pi$ will contain the tangent
linear variety at some simple point of $V$, and each of these will contain the
tangent linear variety at exactly one point of $V$. In addition, each of these
finitely many points of contact for members of $\Pi$ is a generic point of $V$. This
can all be proved as in (10, ch. 1). The fact that $V$ may not be non-singular
makes no essential difference to the technique of the dual variety used there.
Now choose homogeneous co-ordinates $(x_1, x_2, \ldots, x_N, x_{N+1})$ in the space
containing $V$ such that $x_{N+1} = 1$ and the equations of the members of $\Pi$ are
of the form $x_N = \text{constant}$, and also such that, if $V$ is of dimension $m$, the
projection of $V$ into the linear subspace $x_{m+1} = x_{m+2} = \ldots = x_{N-1} = 0$ is
one-one around a generic point. When this is done the equations of $V$ (in
affine form) will be

$$F(x_1, x_2, \ldots, x_m, x_N) = 0, \qquad x_i = R_i(x_1, x_2, \ldots, x_m, x_N), \tag{1}$$

where $i = m + 1, m + 2, \ldots, N - 1$, $F$ being a polynomial and the $R_i$
rational functions with coefficients which are real when $V$ is a real variety.
Also, making a shift of origin to one of the points at which a member of $\Pi$
contains the tangent linear space to $V$, and remembering that such a point is
generic on $V$ over the real numbers, it turns out that equations (1) can be
written in the form

$$x_N = f(x_1, x_2, \ldots, x_m), \qquad x_i = g_i(x_1, x_2, \ldots, x_m), \tag{2}$$

where $i = m + 1, m + 2, \ldots, N - 1$, and the functions $f$ and the $g_i$ are
real analytic in a sufficiently small neighbourhood of the origin. Also, since
the new origin started off as a generic point of $V$ the power series expansion
for $f$ around that point is of the form

$$f = \sum_{i,j=1}^{m} a_{ij} x_i x_j + \cdots, \tag{3}$$

where the dots denote terms of order greater than two and the determinant
$|a_{ij}|$ is not zero. The linear terms are of course zero because the tangent linear
variety to $V$ at the origin is contained in $x_N = 0$.

Now in what has just been said the pencil $\Pi$ is generic, that is to say,
the coefficients of the linear equations defining the axis of $\Pi$ are indeterminates
over the real numbers. The condition that the choice of $\Pi$ and of co-ordinates
as above should not give equations for $V$ of the type (2) and (3) at each of
the points where a member of $\Pi$ contains the tangent linear variety is algebraic
in these indeterminates. It follows that the coefficients of the equations of the
axis of $\Pi$ can be given real values in such a way that the equations of $V$ can
be brought into the form described above. A final point is that, since the
pencils which are unfavourable lie in an algebraic family, then whatever
co-ordinate system is given in the space containing $V$, a linear change of
co-ordinates with a matrix whose elements are arbitrarily close to those of the
identity matrix will yield a co-ordinate system in which the equations of $V$
can be written in the manner just described.

The discussion just carried out is now to be applied to the variety $V$ of
dimension $n + 1$ introduced in the earlier part of this proof, namely the real
variety containing the manifold with boundary $M$ whose sections with $x_N = 0$
and $x_N = 1$ are the manifolds $M_1$ and $M_2$ respectively. Then a small displacement
of the given co-ordinate system will give a system with the following
properties. There is a neighbourhood $U$ of $M$ such that the intersection of
$U \cap V$ with $x_N = c$ is non-singular for all except a finite set of values $c_i$
of $c$; for each $i$, $x_N = c_i$ intersects $U \cap V$ in a section with exactly one
singular point, say $P_i$; if $P_i$ is taken as origin the equations of $V$ can be
written in the form (2) and (3) around $P_i$. Since $V$ was approximately orthogonal
to $x_N = 0$ and $x_N = 1$ at points of $M_1$ and $M_2$ in terms of the original
co-ordinates, and since the displacement of co-ordinates is supposed to be
small, it follows that the intersections of $x_N = 0$ and $x_N = 1$ with $U \cap V$ in
the new co-ordinates are respectively homeomorphic to $M_1$ and $M_2$. Again
it is convenient to change the notation and simply to say that these intersections
are $M_1$ and $M_2$.

To complete the proof of the theorem it will be shown that the transition
from the intersection of $U \cap V$ with $x_N = c_i - \epsilon$ to its intersection with
$x_N = c_i + \epsilon$, for some small positive $\epsilon$, can be made by means of a spherical
modification. To do this fix attention on one of the $P_i$ and take it as origin.
Then in a neighbourhood of the origin $V$ will have equations of the type (2),
with $f$ of the form (3). With this new arrangement of the co-ordinates the
section $M(c)$ of $M$ by the hyperplane $x_N = c$, for sufficiently small $c$, will
have equations in a neighbourhood of the origin of the form

$$\sum_{i,j=1}^{n+1} a_{ij} x_i x_j + \phi = c, \tag{4}$$

where $\phi$ is a power series in the variables $x_1, x_2, \ldots, x_{n+1}$ of order not less than
three, along with further equations which express $x_{n+2}, x_{n+3}, \ldots, x_{N-1}$ as
analytic functions of $x_1, x_2, \ldots, x_{n+1}$. By a linear change of the variables
$x_1, x_2, \ldots, x_{n+1}$ the quadratic terms in (4) can be diagonalized. Assuming
that this has been done, (4) will be of the form

$$\sum_{i=1}^{n+1} a_i x_i^2 + \phi = c. \tag{5}$$

Since $\phi$ contains only terms of degree greater than two, a theorem of Samuel
(5) shows that, for sufficiently small values of the variables, an analytic
change of co-ordinates from $x_1, x_2, \ldots, x_{n+1}$ to a new set $y_1, y_2, \ldots, y_{n+1}$ can
be made by formulae of the type $x_i = y_i + A_i(y)$, where the $A_i$ are power
series of order not less than two, in such a way that

$$\sum a_i x_i^2 + \phi = \sum a_i y_i^2.$$

By orthogonal projection from $(x_1, x_2, \ldots, x_N)$-space into $(x_1, x_2, \ldots, x_{n+1})$-space
followed by a change to the $y$-co-ordinates it is then clear that a neighbourhood
of the origin on $V$, that is to say on $M$, can be mapped analytically
and homeomorphically on a neighbourhood of the origin in $(y_1, y_2, \ldots, y_{n+1})$-space,
and the parts of the $M(c)$ near the origin in $N$-space will be mapped
into the family of quadrics $Q(c)$, or at least the parts of these quadrics near
the origin, in $(y_1, y_2, \ldots, y_{n+1})$-space, where $Q(c)$ has the equation

$$\sum_{i=1}^{n+1} a_i y_i^2 = c. \tag{6}$$


Now it can be explicitly verified that if $r + 1$ of the $a_i$ in (6) are positive
and the rest negative (none are zero) and if $c_0$ is positive then the transition
from $Q(c_0)$ to $Q(-c_0)$ can be made by a spherical modification of type
$(r, n - r - 1)$. In addition the homeomorphism induced by this modification
can be constructed in a particular way. Namely, if small neighbourhoods,
more precisely normal bundles, of the spheres

$$\sum_{i=1}^{r+1} a_i y_i^2 = c_0, \qquad y_j = 0 \quad (j \ge r + 2)$$

on $Q(c_0)$ and

$$\sum_{i=r+2}^{n+1} a_i y_i^2 = -c_0, \qquad y_j = 0 \quad (j \le r + 1)$$

on $Q(-c_0)$ are removed (here it is assumed that $a_1, a_2, \ldots, a_{r+1}$ are the
positive $a_i$) then the corresponding points on the remaining sets of $Q(c_0)$ and
$Q(-c_0)$ are joined to each other by members of the family $F$ of orthogonal
trajectories to the family of $Q(c)$.
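
The rule relating the signs of the coefficients in (6) to the type of the modification can be sketched in a few lines. The function name `modification_type` and the list representation of the coefficients are illustrative assumptions, not notation from the paper; the rule itself is the one just stated, namely that $r + 1$ positive coefficients among $n + 1$ give type $(r, n - r - 1)$.

```python
def modification_type(coefficients):
    """Type (r, n - r - 1) of the transition Q(c0) -> Q(-c0), c0 > 0.

    `coefficients` holds the n + 1 non-zero numbers a_1, ..., a_{n+1}
    of the diagonal quadric sum a_i * y_i**2 = c (hypothetical helper).
    """
    if any(a == 0 for a in coefficients):
        raise ValueError("the quadric must be non-degenerate: no a_i may be zero")
    n = len(coefficients) - 1                  # dimension of the sections Q(c)
    positives = sum(1 for a in coefficients if a > 0)
    r = positives - 1                          # r + 1 of the a_i are positive
    return (r, n - r - 1)


# One-sheeted to two-sheeted hyperboloid in 3-space: a circle is shrunk
# and a pair of points (a 0-sphere) is introduced.
print(modification_type([1.0, 1.0, -1.0]))    # (1, 0)
```

When all the coefficients have the same sign the function returns $r = n$ or $r = -1$; these are the extreme cases in which the singular section has an isolated point.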

Returning to the variety $V$ and more specifically to $M$, it has already
been seen that $y_1, y_2, \ldots, y_{n+1}$ can be taken as a set of local analytic co-ordinates
on $M$ around the origin. Also the ordinary Euclidean metric in
$(y_1, y_2, \ldots, y_{n+1})$-space induces a Riemannian metric on $M$ in a neighbourhood
of the origin. By means of a partition of unity a Riemannian metric can be
set up on the whole of $M$ so as to agree with this induced metric in a sufficiently
small neighbourhood of the origin on $M$. Then the image on $M$ of the family
$F$ of orthogonal trajectories to the $Q(c)$ can be extended to the family $F'$ of
orthogonal trajectories to the family of sections $M(c)$ of $M$, at least in a
neighbourhood of $M(0)$. It is thus clear that, for $c_0$ sufficiently small and
positive, if the images on $M$ of the spheres on $Q(c_0)$ and $Q(-c_0)$ mentioned
above are removed, then the remaining sets on $M(c_0)$ and $M(-c_0)$ are
homeomorphic, corresponding points being joined by members of the family
$F'$. Apart from this the spherical modification carrying $Q(c_0)$ into $Q(-c_0)$, in
so far as it affects points near the origin, is carried into a similar modification
taking $M(c_0)$ into $M(-c_0)$. And this completes the proof of the theorem.

It is possible to give part (a) of the above theorem a more precise form.
Namely, if $M_2$ is obtained from $M_1$ by a single spherical modification, then
the manifold $M$ can be constructed in such a way that $M_1$ and $M_2$ belong
to a pencil of hyperplane sections of $M$ containing exactly one singular section.
In other words the given modification can be made to arise in the same way
as the modifications shown to exist in part (b) of the theorem. To prove this,
the cell $E^{n+1}$ which appeared in the course of the proof of part (a) must be
constructed in a special way. For values of $t$ such that $-1 \le t \le 1$, let $Q(t)$
be the quadric hypersurface $x_1^2 + x_2^2 + \ldots + x_{r+1}^2 - x_{r+2}^2 - \ldots - x_{n+1}^2 = t$
in $(n + 1)$-space. The section of $Q(1)$ by the linear space $x_{r+2} = x_{r+3} = \ldots
= x_{n+1} = 0$ is an $r$-sphere $S^r$ whose normal bundle of some convenient radius
in $Q(1)$ is a set $B_1'$ homeomorphic to $S^r \times E^{n-r}$, and so to $B_1$ (in the notation
of part (a) of the above theorem). Construct the family of orthogonal trajectories
$F$ to the family $Q(t)$. Then the set of points on curves of $F$ meeting
$Q(1)$ at points of $B_1'$ is an $(n + 1)$-cell $E'^{n+1}$. It is clear that, apart from
the curves of $F$ starting at points of $S^r$, all of which end at the origin, all
the members of $F$ starting at points of $B_1'$ reach $Q(-1)$ at points in the normal
bundle $B_2'$ of the sphere $S^{n-r-1}$ in which $Q(-1)$ is cut by the linear space
$x_1 = x_2 = \ldots = x_{r+1} = 0$, and similarly the other way round. $B_2'$ is homeomorphic
to $S^{n-r-1} \times E^{r+1}$, that is to say, to $B_2$. Now, referring to the proof
of part (a) of the above theorem, it will be seen that the frontier of $E^{n+1}$ first
appeared as the frontier of $(M_1 - B_1) \times I$ with the sets $B_1$ and $B_2$ added
in the appropriate way, $M_1$ and $M_2$ being identified with $(M_1 - B_1) \times \{0\} \cup B_1$
and $(M_1 - B_1) \times \{1\} \cup B_2$, respectively. The frontiers of $E^{n+1}$ and $E'^{n+1}$
are now to be identified. To do this define a mapping $f$ of the frontier of
$E'^{n+1}$ onto that of $E^{n+1}$ as follows: first $f$ is to be defined as a homeomorphism
of $B_1'$ onto $B_1$ preserving the product structure. Then if $(p, t)$ is the point
of parameter $t$ (that is, the point lying on $Q(t)$) on the curve of the family $F$
which passes through $p$ on $B_1'$, $f(p, t)$ will be defined as the point $(f(p), \tfrac{1}{2} - \tfrac{1}{2}t)$
in $\mathrm{Fr}\,B_1 \times I$ (this makes sense as $f(p)$ is already defined). In particular $f$ is
now defined as a homeomorphism of $\mathrm{Fr}\,B_2'$ onto $\mathrm{Fr}\,B_2$, preserving the product
structure, and so it can be extended over the whole of $B_2'$, carrying this set
homeomorphically onto $B_2$. $f$ is now defined on the whole of $\mathrm{Fr}\,E'^{n+1}$, and
so can be extended to a homeomorphism of $E'^{n+1}$ onto $E^{n+1}$.
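
The behaviour of the orthogonal trajectories to the quadrics $Q(t)$ can be made explicit. The sketch below is an illustration under a stated assumption, namely that the trajectories are taken with respect to the ordinary Euclidean metric, so that they are the gradient curves of the quadratic form: along such a curve the first $r + 1$ co-ordinates are multiplied by a factor $k > 0$ and the remaining ones by $1/k$. The function names `q` and `follow_trajectory` are hypothetical.

```python
import math


def q(point, r):
    """Value of x_1^2 + ... + x_{r+1}^2 - x_{r+2}^2 - ... - x_{n+1}^2."""
    pos = sum(x * x for x in point[: r + 1])
    neg = sum(x * x for x in point[r + 1:])
    return pos - neg


def follow_trajectory(point, r, t):
    """Point where the gradient curve of q through `point` meets Q(t).

    Returns None when the curve never reaches Q(t), for example when it
    starts on the sphere S^r of Q(1) and runs into the origin.
    """
    a = sum(x * x for x in point[: r + 1])    # positive part of q at the start
    b = sum(x * x for x in point[r + 1:])     # negative part of q at the start
    # On the curve q equals a*u - b/u with u = k**2 > 0; solve for q = t.
    if a > 0:
        u = (t + math.sqrt(t * t + 4 * a * b)) / (2 * a)
    elif b > 0 and t < 0:
        u = -b / t
    else:
        return None
    if u <= 0:
        return None
    k = math.sqrt(u)
    return tuple(x * k for x in point[: r + 1]) + tuple(x / k for x in point[r + 1:])


start = (math.sqrt(2.0), 0.0, 1.0)            # a point of Q(1) with n = 2, r = 1
end = follow_trajectory(start, 1, -1.0)       # the matching point of Q(-1)
on_sphere = follow_trajectory((1.0, 0.0, 0.0), 1, -1.0)   # None: ends at the origin
```

In this example the curve through $(\sqrt{2}, 0, 1)$ reaches $Q(-1)$, whereas the curve through the point $(1, 0, 0)$ of $S^1$ never does, in agreement with the statement that the curves starting on $S^r$ all end at the origin.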
















Using the mapping $f$ just defined, the family $M(t)$ of sets will now be
defined. For each $t$ such that $-1 \le t \le 1$ set

$$M(t) = f(Q(t) \cap E'^{n+1}) \cup (M_1 - B_1) \times \{s\},$$

where $s = \tfrac{1}{2} - \tfrac{1}{2}t$. Then, for each $t \ne 0$, $M(t)$ is a manifold, and $M(0)$ has
a single isolated singular point corresponding to the vertex of the cone $Q(0)$.
In particular $M(1) = M_1$ and $M(-1) = M_2$.

If the family $M(t)$ is in $(x_1, x_2, \ldots, x_{N-1})$-space, then the manifold $M$ of
part (a) of Theorem 1 can be constructed in $(x_1, x_2, \ldots, x_N)$-space by taking
it as the set whose intersection with $x_N = t$ is $M(t)$. As the construction has
been done here, $M$ and the $M(t)$ may not be differentiable, but they can clearly
be arranged to be so by taking suitable precautions when the boundary of
$E^{n+1}$ and that of $(M_1 - B_1) \cup B_1 \cup B_2$ are identified, and when the mapping
$f$ is extended into the interior of $E'^{n+1}$.


A further point to notice is the existence of a family $F$ of curves on $M$
consisting of the image under $f$ of the orthogonal trajectories to the $Q(t)$
lying in $E'^{n+1}$ along with all the curves of the form $\{p\} \times I$ for $p$ in $M_1 - B_1$.
These curves have the following properties:

(1) Exactly one of them passes through each point of $M$ different from $P$,
the image under $f$ of the origin in $(x_1, x_2, \ldots, x_{n+1})$-space.

(2) The curves starting on $S^r$ in $M_1$ all end at $P$; so also do those which
start at points of $S^{n-r-1}$ in $M_2$.

(3) The set of points on the members of $F$ starting on $S^r$ is an $(r + 1)$-cell
$E^{r+1}$ in $M$. Thus $E^{r+1}$ is an $(r + 1)$-cell in $M$ with boundary $S^r$ on $M_1$. Similarly
there is an $(n - r)$-cell $E^{n-r}$ in $M$ with boundary $S^{n-r-1}$ on $M_2$.

Suppose that, in addition to the modification $\phi$ carrying $M_1$ into $M_2$, a
second modification $\phi'$ is applied to $M_2$, taking it into $M_3$, and suppose that
a manifold $M'$ having $M_2$ and $M_3$ as its boundary and containing a family $F'$
of curves with properties similar to (1), (2), and (3) above has been constructed
in the manner just described for $M$ and $F$. Then $M$ and $M'$ can
be joined together along $M_2$, and if suitable precautions are taken the result
will be a differentiable manifold. Also the families $F$ and $F'$ can be combined,
each curve of $F$ being joined to the curve of $F'$ starting at its end point on
$M_2$. Now it has been remarked that a displacement of the sphere shrunk in
a modification does not affect the result, and so if $\phi'$ is of type $(s, n - s - 1)$
with $s \le r$ it can always be arranged that the $S^s$ shrunk by $\phi'$ does not meet
the $S^{n-r-1}$ introduced by $\phi$. It follows that the curves of $F'$ starting on $S^{n-r-1}$
can be added to $E^{n-r}$ to give a larger $(n - r)$-cell in $M \cup M'$ with its boundary
in $M_3$. A similar remark can be made concerning any sequence of modifications
of suitable types.

It should be remarked here that, in the proof of part (b) of the above
theorem, there is an extreme case which may occur, corresponding to the
values $-1$ or $n$ for $r$. This arises when a section $x_N = c$ of $M$ has a singularity
which is an isolated point. Although, strictly speaking, this should be allowed as
a modification with the appropriate alteration to the statement of Theorem 1,
it will turn out (cf. § 4, Theorem 4) that these extreme cases can be avoided
by suitably transforming the manifold $M$.


3. The oriented case. For the present purpose the most convenient way
of fixing the orientation of a connected orientable differentiable manifold is
by means of sets of local co-ordinates. Namely, having fixed a co-ordinate
system in a neighbourhood $U$, a second system in $U$ will be called positively
or negatively oriented according as the Jacobian of the co-ordinate transformation
is positive or negative. For a connected orientable manifold there
is a covering by co-ordinate neighbourhoods with co-ordinates chosen so
that, in the overlap of any two of the neighbourhoods, the Jacobian of the
corresponding co-ordinate transformation is positive. If the restriction to $U$
of any one of these co-ordinate systems is positively oriented then the whole
collection of local co-ordinate systems defines on the manifold the orientation
induced by the fixed system in $U$.
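
The sign test on the Jacobian can be illustrated for a co-ordinate transformation whose Jacobian matrix at a point is known. This is a minimal sketch; the helper names `determinant` and `relative_orientation` are assumptions made for the illustration, and the determinant is computed by Laplace expansion, which is adequate only for small matrices.

```python
def determinant(matrix):
    """Determinant by Laplace expansion along the first row (small matrices only)."""
    size = len(matrix)
    if size == 1:
        return matrix[0][0]
    total = 0
    for col in range(size):
        # Minor obtained by deleting the first row and the column `col`.
        minor = [row[:col] + row[col + 1:] for row in matrix[1:]]
        total += (-1) ** col * matrix[0][col] * determinant(minor)
    return total


def relative_orientation(jacobian):
    """+1 if the second co-ordinate system is positively oriented, -1 otherwise."""
    det = determinant(jacobian)
    if det == 0:
        raise ValueError("not a valid co-ordinate transformation at this point")
    return 1 if det > 0 else -1


swap = [[0, 1], [1, 0]]             # interchange of two co-ordinates
quarter_turn = [[0, -1], [1, 0]]    # rotation through a right angle
print(relative_orientation(swap), relative_orientation(quarter_turn))   # -1 1
```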

The following lemmas prepare the way for the main result of this section. 


LEMMA 3.1. Let $M$ be a connected orientable differentiable manifold in Euclidean
$N$-space, and let $H$ be a hyperplane such that $H \cap M$ is a connected differentiable
manifold. Then $H \cap M$ is orientable.


Proof. Local co-ordinates can be taken on $M$ in a neighbourhood $U$ of a
point of $H \cap M$ in such a way that, if the Euclidean co-ordinates have been
arranged so that $H$ has the equation $x_N = 0$, then $x_N$ is one of the local
co-ordinates. It is clear then that $x_N$ can be included among the local co-ordinates
around every point of $H \cap M$, and so the orientation induced on $M$
by the selected co-ordinate system in $U$ automatically defines an orientation
on $H \cap M$, which is therefore orientable.


COROLLARY. If $M$ is an orientable differentiable manifold with a connected
boundary which is also a differentiable manifold, then this boundary is also
orientable.


Proof. For the given manifold can be so arranged that the boundary is a 
hyperplane section. 

In the above lemma it should be noted (and this observation also applies
to the corollary) that, if $H \cap M$ is not connected, orientability holds for
each of the connected components separately.


LEMMA 3.2. Let $M$ and $M'$ be connected orientable differentiable manifolds
having a common boundary which is a connected differentiable manifold $M_1$.
Then $M \cup M'$ is orientable.


Proof. Embed $M$ and $M'$ in $N$-space so that $M$ is in the set $x_N \le 0$ and $M'$
in the set $x_N \ge 0$, $M_1$ thus being the section of $M \cup M'$ by $x_N = 0$. It is
then easy to see that local co-ordinates in $M \cup M'$ can be chosen around
each point of $M_1$ so that $x_N$ is always included as one of the co-ordinates,
while the rest of $M \cup M'$ can be covered by co-ordinate neighbourhoods in
$M$ and $M'$ separately. Since $M$ and $M'$ are orientable and $M_1$ is connected it
follows at once that the co-ordinates can be chosen in each of these neighbourhoods
so that an orientation is defined on $M \cup M'$ as required.


LEMMA 3.3. Let $M_1$ be a connected orientable differentiable $n$-manifold, and
let $M_2$ be obtained from $M_1$ by a spherical modification of type $(r, n - r - 1)$
with $r$ not equal to $0$ or $n - 1$. Then $M_2$ is orientable.


Proof. Suppose that the modification in question shrinks the sphere $S^r$ with
normal bundle $B_1$ in $M_1$. Then $M_1 - B_1$ is an oriented manifold with a connected
boundary. Also $B_2$ (the set to be added to $M_1 - B_1$ in the modification)
is oriented with the same connected boundary. Then by Lemma 3.2
$M_2 = (M_1 - B_1) \cup B_2$ is orientable.

The condition on $r$ in the last lemma cannot be dropped. For it is possible
for a $(0, n - 1)$- or $(n - 1, 0)$-modification to change the orientability or
otherwise of a manifold, as, for example, in the case of a $(0, 1)$-modification
applied to the surface of a sphere to make it into a Klein surface. Of course
there are two ways in which a $(0, 1)$-modification can be applied to a sphere,
the one giving a torus and the other a Klein surface. A similar situation holds
in general. For the effect of a $(0, n - 1)$-modification on a manifold $M_1$ is
to remove two disjoint $n$-cells from $M_1$ (namely the normal bundle of the
$S^0$ to be shrunk) and to identify the points of the two $(n - 1)$-spheres which
are their boundaries. Clearly there are essentially two different ways of making
this identification, and if $M_1$ is orientable one of these ways will give an
orientable $M_2$ and the other a non-orientable one. If the $(0, n - 1)$-modification
carries an orientable manifold into another orientable manifold, then
the modification itself will be said to be orientable.
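
That orientability of a $(0, n - 1)$-modification is a genuinely separate condition can be seen from the Euler characteristic, which does not distinguish the two ways of making the identification. The sketch below is not taken from the paper; it assumes compact manifolds and uses the standard additivity of the Euler characteristic, which gives $\chi(M_2) = \chi(M_1) - \chi(S^r) + \chi(S^{n-r-1})$ for a modification of type $(r, n - r - 1)$ with $0 \le r \le n - 1$. The function names are hypothetical.

```python
def euler_characteristic_of_sphere(k):
    """Euler characteristic of the k-sphere: 2 for even k, 0 for odd k."""
    return 1 + (-1) ** k


def euler_characteristic_after(chi, n, r):
    """Euler characteristic after a modification of type (r, n - r - 1).

    `chi` is the Euler characteristic of the compact n-manifold before
    the modification; 0 <= r <= n - 1 is assumed.
    """
    if not 0 <= r <= n - 1:
        raise ValueError("r must satisfy 0 <= r <= n - 1")
    return (chi
            - euler_characteristic_of_sphere(r)
            + euler_characteristic_of_sphere(n - r - 1))


# A (0, 1)-modification of the 2-sphere: the result has Euler characteristic 0
# whether it is the torus or the Klein surface.
print(euler_characteristic_after(2, 2, 0))    # 0
```

Both the torus and the Klein surface have Euler characteristic zero, so this invariant cannot tell the orientable modification from the non-orientable one.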

The following theorem now gives the necessary complement to Theorem 1 
for the case of orientable manifolds. 


THEOREM 2. Let $M_1$ and $M_2$ be two orientable differentiable manifolds. Then,
with suitable orientations of their connected components, they cobound in the
oriented sense if and only if they are related by a finite sequence of spherical
modifications of which each modification of type $(0, n - 1)$ or $(n - 1, 0)$ is
orientable.


Proof. If $M_1$ and $M_2$ cobound in the oriented sense, then, by definition,
their union constitutes the boundary of an orientable manifold $M$, and the
orientations of the various components of $M_1$ and $M_2$ are supposed to be
those induced by some selected orientation of $M$. As in Theorem 1, $M$ is to
be taken as part of a real algebraic variety in $N$-space such that $M_1$ and $M_2$
are the sections of $M$ by the hyperplanes $x_N = 0$ and $x_N = 1$, while the rest
of $M$ lies between these hyperplanes. Also just a finite number of the hyperplanes
$x_N = c$ are to cut $M$ in singular sections, each with exactly one singular
point as in Theorem 1. By Lemma 3.1 and the remark following it, each
hyperplane $x_N = c$, except those cutting singular sections, cuts $M$ in a differentiable
manifold whose components are orientable, with orientations induced
by that of $M$. It follows at once, by considering sections on either side of a
singular section corresponding to a $(0, n - 1)$- or $(n - 1, 0)$-modification,
that each such modification must be orientable (noting that this terminology
makes sense whether the modification affects one component only or has the
effect of joining two components together, for these components all have well
defined orientations). This completes the proof in one direction.

To prove the converse, let $M_2$ be obtained from $M_1$ by a sequence of
spherical modifications in which each of type $(0, n - 1)$ or $(n - 1, 0)$ is
orientable. Here it is assumed that the components of $M_1$ are given preassigned
orientations. Then Lemma 3.3 along with the assumed orientability of the
$(0, n - 1)$- and $(n - 1, 0)$-modifications ensures that, as each modification is
performed, the result is orientable with a naturally induced orientation on
each component. The final result is supposed to be $M_2$ with suitable orientations
on its components. The object now is to show that $M$, constructed as
in Theorem 1, part (a), is orientable, and that it can be oriented in such a
way that the correct orientations are induced on the components of $M_1$ and
$M_2$. Clearly it is sufficient to carry out the proof in the case where $M_1$ and $M_2$
are related by one spherical modification.

Consider then the construction of $M$ in the proof of Theorem 1, part (a). If
the modification in question is of type $(r, n - r - 1)$ with $r$ not $0$ or $n - 1$
it may as well be assumed that $M_1$ is connected, since such a modification
will affect just one component. Then, in the notation of part (a) of Theorem 1,
$(M_1 - B_1) \times I$ is orientable and it is not hard to see that its frontier along
with $B_1$ and $B_2$ will make up an oriented $S^n$, the orientation induced by that
of $(M_1 - B_1) \times I$. It follows at once that when the cell $E^{n+1}$ is added to
form $M$ the latter will be orientable and its orientation will induce that of
$M_1$ and $M_2$. In the case of a $(0, n - 1)$-modification, assumed orientable,
this assumption turns out to be exactly what is wanted to ensure that
$(\mathrm{Fr}\,B_1 \times I) \cup B_1 \cup B_2$ will be an oriented $n$-sphere, $M_1$ and $M_2$ having been
suitably oriented. Then as before the addition of an $(n + 1)$-cell gives an
orientable manifold as required.


4. Rearrangement of modifications. In general there is no guarantee 
that the members of a sequence of modifications can be commuted among 
themselves, for the spheres introduced by the earlier modifications may inter- 
sect those to be shrunk in the later ones and it may be impossible to disentangle 
them. There are, however, certain ways in which the order of a sequence of 
modifications can be changed, and these will be examined in this section. 


THEOREM 3. Let $M_1$ and $M_2$ be $n$-dimensional differentiable manifolds related
by a sequence of spherical modifications of types $(n - p - 1, p)$ for various
values of $p$ not less than $r$. Then the order of these modifications can be changed
in such a way that all the $(n - r - 1, r)$-modifications are done last ($M_1$ being
counted as the initial state).


Proof. The assumption on $p$ is vacuous if $r$ is zero, but otherwise the proof
in this case is the same. The situation of § 2 will be assumed to hold here, in
particular as described in the remarks at the end of the section following the
proof of Theorem 1. Namely, $M_1$ and $M_2$ will be assumed to form the boundary
of a differentiable manifold $M$ in $N$-space, and in fact to be the sections of $M$
by the hyperplanes $x_N = 0$ and $x_N = 1$, the rest of $M$ lying between these. And
among the sections of $M$ by the family $x_N = c$ there are to be finitely many
with a singularity, each corresponding to a spherical modification. It will
also be assumed for the moment that none of these singular sections has an
isolated point corresponding to an $n$-sphere which shrinks to a point and
vanishes as the section $x_N = c$ varies from $c = 0$ to $c = 1$. This restriction
will be removed later (cf. Theorem 4). The section of $M$ by $x_N = t$ is to be
denoted by $M(t)$, and as in § 2 there is to be a family $F$ of curves in $M$ cutting
across the non-singular $M(t)$ transversally.

Starting from $M_1$ let $\phi$ be the first modification of type $(n - r - 1, r)$,
corresponding to a section $M(c)$ of $M$ with a singularity at the point $P$. Then,
as remarked in § 2, it can be assumed that the spheres shrunk in later modifications
do not meet the members of $F$ which meet the $r$-sphere introduced
by $\phi$, since all other modifications are of type $(n - p - 1, p)$ with $p \ge r$. The
points of all the curves of $F$ starting at $P$ and lying in the part of $M$ for which
$x_N \ge c$ form an $(r + 1)$-cell $E^{r+1}$ with boundary $S^r$ contained in $M_2$. The idea
of this proof is to deform the family $M(t)$ in a neighbourhood of $E^{r+1}$, so
obtaining a new family of submanifolds, some with singularities. $M$ is then
to be deformed so that this new family becomes a pencil of hyperplane sections,
a finite number being singular. These singular sections will correspond to a
sequence of spherical modifications leading from $M_1$ to $M_2$, and it will turn
out that the modifications are all the same as those in the given sequence,
but that $\phi$ now appears last.

The details of the idea just sketched will now be filled in. There is a
neighbourhood U of P on M which is the homeomorphic image, under a mapping f,
of a neighbourhood of the origin on the quadric Q in (n + 2)-space with the
equation

    z = y_1^2 + y_2^2 + ... + y_{r+1}^2 − y_{r+2}^2 − ... − y_{n+1}^2.

By means of this mapping the section M(c + t) of M is locally identified
with the section Q(t) of Q given by z = t (cf. the end of § 2, with the
appropriate changes of notation). Also under this homeomorphism f the sphere
introduced by the modification φ is the image of the sphere on Q given by
y_1^2 + y_2^2 + ... + y_{r+1}^2 = z, y_{r+2} = 0, ..., y_{n+1} = 0, for some
sufficiently small z > 0, and the family F restricted to U is the image of the
family of orthogonal trajectories to the Q(t) in a neighbourhood of the origin.
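
As an illustration of this local model (a sketch only; the abbreviations u and v are introduced here for convenience and are not part of the original notation), write u = (y_1, ..., y_{r+1}) and v = (y_{r+2}, ..., y_{n+1}). The sections of the quadric then change type as t passes through 0, which is the local picture of a spherical modification:

```latex
% Section of the quadric Q by z = t, in the abbreviated coordinates (u, v).
\[
Q(t) = \{\, (u,v) \in \mathbf{R}^{r+1} \times \mathbf{R}^{n-r} : |u|^2 - |v|^2 = t \,\}.
\]
% t > 0: |u|^2 = t + |v|^2 > 0, so each v determines an r-sphere of values of u.
\[
t > 0:\quad Q(t) \cong S^r \times \mathbf{R}^{n-r},
\qquad (u,v) \mapsto \bigl(u/|u|,\; v\bigr).
\]
% t < 0: |v|^2 = |u|^2 - t > 0, so each u determines an (n-r-1)-sphere of values of v.
\[
t < 0:\quad Q(t) \cong \mathbf{R}^{r+1} \times S^{n-r-1},
\qquad (u,v) \mapsto \bigl(u,\; v/|v|\bigr).
\]
% t = 0: the cone |u| = |v|, whose only singular point is the origin.
```

For t > 0 the sphere |u|^2 = t, v = 0 is the r-sphere referred to in the text; it shrinks to the origin as t decreases to 0.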











The next step is to construct a neighbourhood in M of the set E^{r+1} in a
rather special way. First, in the neighbourhood U, take the smaller
neighbourhood f(U_0), image under f of the set in Q defined by the inequalities

    |z| ≤ ε,    y_{r+2}^2 + y_{r+3}^2 + ... + y_{n+1}^2 ≤ δ

for sufficiently small positive ε and δ. It is not hard to see that U_0 is an
(n + 1)-cell with boundary consisting of the following three sets:


(1) The part of z = −ε on Q such that

    Σ_{i=r+2}^{n+1} y_i^2 ≤ δ.

(2) The set |z| ≤ ε satisfying

    Σ_{i=r+2}^{n+1} y_i^2 = δ.

This is homeomorphic to S^r × S^{n−r−1} × I.

(3) The part of z = ε on Q with

    Σ_{i=r+2}^{n+1} y_i^2 ≤ δ.

This is homeomorphic to S^r × E^{n−r}.
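
The two homeomorphisms just stated can be written down explicitly. The following is a sketch under two assumptions that are not in the original text: the abbreviations u = (y_1, ..., y_{r+1}) and v = (y_{r+2}, ..., y_{n+1}) are used, and δ > ε is supposed, so that z + δ > 0 throughout the set (2).

```latex
% Set (3): z = \epsilon and |v|^2 \le \delta on Q, that is |u|^2 = \epsilon + |v|^2.
\[
S^r \times E^{n-r} \longrightarrow (3), \qquad
(\omega, v) \longmapsto (z, u, v) = \bigl(\epsilon,\; \sqrt{\epsilon + |v|^2}\,\omega,\; v\bigr),
\qquad |\omega| = 1,\ |v|^2 \le \delta .
\]
% Set (2): |z| \le \epsilon and |v|^2 = \delta on Q, that is |u|^2 = z + \delta.
\[
S^r \times S^{n-r-1} \times I \longrightarrow (2), \qquad
(\omega, \theta, z) \longmapsto (z, u, v) = \bigl(z,\; \sqrt{z + \delta}\,\omega,\; \sqrt{\delta}\,\theta\bigr),
\qquad |\omega| = |\theta| = 1,\ |z| \le \epsilon .
\]
```

Both maps are continuous bijections of compact sets, with inverses obtained by normalizing u and v.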

The image of the set (3) under f is a neighbourhood B_0 of S_0^r in M(c + ε),
S_0^r being the sphere introduced by φ. If B_0 is small enough all the curves of
F meeting it can be continued up to M_2; let B_1 be the set of points on all
these curves. Then define B as the union of B_1 and f(U_0). Clearly B is a
neighbourhood in M of E^{r+1} and is an (n + 1)-cell with boundary consisting
of the sets:

(1)' The image under f of the set (1) above.

(2)' The union of the image under f of (2) above with the set of points on
curves of F meeting Fr B_0 on M(c + ε).

(3)' B ∩ M_2.

Note that the set (2)', like (2), is homeomorphic to S^r × S^{n−r−1} × I. In
(2) I is identified with the interval |z| ≤ ε, and in (2)' with the interval
c − ε ≤ t ≤ 1, t being the parameter specifying the sections M(t).

The switching of the order of modifications so that φ comes last is carried
out by constructing a new mapping g of U_0 in Q into M, this time mapping
it onto the whole of B. This mapping will be defined by identifying the sets
(1), (2), and (3) on the frontier of U_0 with the sets (1)', (2)', and (3)' on
the frontier of B, and then extending into the interiors of these sets.

The mapping g : Fr U_0 → Fr B is defined as follows:

(a) The restriction of g to the set (1) is to coincide with f. 

(b) g is to map (2) onto (2)'. It has been noted that both sets are
homeomorphic, and in a natural way, to S^r × S^{n−r−1} × I. g will be defined
by giving a homeomorphism h of the interval I in (2), namely −ε ≤ z ≤ ε, onto
the interval I in (2)', namely c − ε ≤ t ≤ 1. h is to be defined in such a way
that the interval −ε ≤ z ≤ 0 is mapped on the interval c − ε ≤ t ≤ 1 − η,
where η is chosen so that all the sections M(t) with t ≥ 1 − η are
homeomorphic to M_2. Apart from this condition h can be arbitrary.

(c) g as defined in (a) and (b) is to be extended in the obvious way to map 
the set (3) on the set (3)’. 

Finally, since U_0 and B are (n + 1)-cells, g can be extended into the interior
of U_0 to give a homeomorphism of U_0 onto B.
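
Since the homeomorphism h of step (b) is arbitrary apart from the one stated condition, a concrete choice may help fix ideas. The piecewise-linear map below is one possibility, offered only as an illustration and not as the choice made in the original:

```latex
% One admissible h: piecewise linear on [-\epsilon, \epsilon], increasing,
% sending [-\epsilon, 0] onto [c - \epsilon, 1 - \eta] and [0, \epsilon] onto [1 - \eta, 1].
\[
h(z) =
\begin{cases}
(c - \epsilon) + \dfrac{z + \epsilon}{\epsilon}\,\bigl(1 - \eta - c + \epsilon\bigr),
  & -\epsilon \le z \le 0, \\[2ex]
(1 - \eta) + \dfrac{z}{\epsilon}\,\eta,
  & 0 \le z \le \epsilon .
\end{cases}
\]
% Check of the three breakpoints:
% h(-\epsilon) = c - \epsilon, \qquad h(0) = 1 - \eta, \qquad h(\epsilon) = 1.
```

Both slopes are positive because c − ε < 1 − η < 1, so h is a homeomorphism of the two intervals.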

To define a new sequence of modifications relating M_1 and M_2, construct
a family M'(t) of subsets of M as follows:

For t ≤ c − ε, M'(t) = M(t).

For c − ε ≤ t ≤ 1 − η, M'(t) is the union of the part of M(t) outside B
with g(Q(h^{-1}(t)) ∩ U_0).

For t ≥ 1 − η, M'(t) = M(t).

The M'(t) as so defined may not be differentiable but can be made so (apart
from a finite number each of which will have one singularity) by a suitable
adjustment, or by a suitable definition of g in the first place. Define M' to
be the set in (x_1, x_2, ..., x_{N+1})-space such that M'(t) is the section by
x_{N+1} = t (M of course is supposed to be in (x_1, x_2, ..., x_N)-space). In
particular M'(0) = M_1 and M'(1) = M_2, and so M_1 and M_2 cobound the new
manifold M', which, incidentally, is clearly homeomorphic to M.

Consider now the set of modifications corresponding to the singular members
of the family M'(t). For t < 1 − η, the only singular M'(t) are those
corresponding to all the original modifications relating M_1 and M_2, except φ.
M'(1 − η) is a singular section of M' corresponding to an
(n − r − 1, r)-modification φ'. And there are no further modifications.

φ' can be thought of as the modification φ shifted to the end of the sequence
of modifications. To complete the proof of the theorem, each
(n − r − 1, r)-modification is to be shifted to the end in this way, and this
can be done in a finite number of steps as above.

There are a number of remarks and corollaries connected with the theorem
just proved. In the first place it must be emphasized that M', as constructed
in the course of the proof, is homeomorphic to M; this point is of importance
in certain applications where the main object of interest is not the pair of
manifolds M_1 and M_2 but the manifold M which they bound. Another point is
that φ was taken as the first (n − r − 1, r)-modification starting from M_1.
It is quite clear however that M_1 and M_2 could be replaced by two
intermediate sections M_1' and M_2' of M, when the same method of proof would
show that any (n − r − 1, r)-modification can be moved to any later stage
in the sequence of modifications leading from M_1 to M_2.

An essential result which must now be obtained is the possibility of removing
the restriction imposed in Theorem 1, that no section of M by a hyperplane
x_N = c should have an isolated point.











THEOREM 4. Let M_1 and M_2 cobound M, these manifolds being arranged as
in Theorem 1 in Euclidean N-space, singular sections by hyperplanes x_N = c
corresponding to spherical modifications leading from M_1 to M_2. Then the
embedding of M can be done in such a way that no section by a hyperplane
x_N = c has an isolated point.


Proof. Proceeding from M_1 to M_2 let M(c) (notation of Theorem 1) be the
last section of M with an isolated point P corresponding to a vanishing
sphere. That is to say M(c) has the isolated point P and for small ε M(c − ε)
has a small isolated sphere near P, while M(c + ε) has no points near P.
Varying t from c downwards, the n-sphere introduced at P becomes joined to
some other component of a section of M by a (0, n − 1)-modification (possibly
after some modifications have been applied to the sphere itself). Let φ be the
inverse of this (0, n − 1)-modification, corresponding to a singular section
M(c') of M, and then, for a sufficiently small ε, apply Theorem 3 to the part
of M between M(c' − ε) and M(c − ε). The result is that it can be assumed
that, in the sequence of modifications leading from M_1 to M_2, the last
modification before the vanishing of the n-sphere at P is an
(n − 1, 0)-modification which isolates that sphere. This modification will
still be called φ, and the corresponding singular section of M will be M(c').

For t near c' but less than it, M(t) contains an (n − 1)-sphere S^{n−1}(t)
which is to be shrunk by the modification φ. The part of M(t) on one side of
S^{n−1}(t) is an n-cell E^n(t). As t tends to c', E^n(t) closes up to form an
n-sphere, and M(t), for c' < t < c, contains this detached sphere S^n(t) which
shrinks to a point as t tends to c. It is clear that, for a sufficiently small
positive ε, the union of all the E^n(t) for c' − ε ≤ t ≤ c' and all the S^n(t)
for c' ≤ t ≤ c is an (n + 1)-cell E^{n+1}, having on its boundary the n-cell
E^n formed by the union of all the S^{n−1}(t) for c' − ε ≤ t ≤ c' (S^{n−1}(t)
reduces to a point for t = c'). E^{n+1} is homeomorphic to a solid
(n + 1)-dimensional hemisphere, E^n corresponding to the solid n-sphere
forming the base, and so, corresponding to the fibring of the hemisphere by
concentric n-dimensional hemispheres, E^{n+1} can be fibred by a family of
n-cells E_1^n(t) such that S^{n−1}(t) is the frontier of E_1^n(t).

Now define the family M’(t) of subsets of M as follows: 


    M'(t) = M(t)                           for t ≤ c' − ε;
    M'(t) = (M(t) − E^n(t)) ∪ E_1^n(t)     for c' − ε ≤ t ≤ c';
    M'(t) = M(t) − S^n(t)                  for c' < t ≤ c;
    M'(t) = M(t)                           for t > c.


Having done this, let M' be the set in (N + 1)-space such that M'(t) is
its section by the hyperplane x_{N+1} = t. It is clear that M' can be adjusted
to become a differentiable manifold, and that M_1 and M_2 will form its
boundary. The singular sections of M' by members of the pencil x_{N+1} = t
correspond to a sequence of spherical modifications leading from M_1 to M_2.
These modifications are the same as the original ones (corresponding to the
singular sections of M) with the exception that φ has now dropped out, and the
section corresponding to the isolated point at P is no longer there. By means
of a finite number of steps as just described, all singular sections with
isolated points can be removed.

In connection with the proof of this theorem it should be noted that the 
manifold M’ is homeomorphic to M. 

The results of Theorems 3 and 4 can now be combined to give a stronger 
form of Theorem 1. 


THEOREM 5. Let the n-dimensional differentiable manifolds M_1 and M_2 form
the boundary of the differentiable manifold M. Then M can be embedded in
N-space, for sufficiently large N, as part of a real algebraic variety, M lying
entirely between the hyperplanes x_N = 0 and x_N = 1. Only a finite number of
sections by hyperplanes x_N = c (0 < c < 1) will have singular points, one point
on each such section, and none of these singular points will be an isolated point
of the section in question. Finally the embedding can be arranged in such a way
that, in the sequence of modifications leading from M_1 to M_2, corresponding to
the singular sections of M, all the (r, n − r − 1)-modifications come before the
(s, n − s − 1)-modifications for each pair of integers r, s with r < s.


Proof. The first part of the theorem is simply Theorem 1. The absence 
of isolated points on the singular sections of M can be brought about by 
Theorem 4, and the ordering of the modifications according to type can be 
done by repeated application of Theorem 3. 

A further point to notice in connection with the last theorem is that the
modifications of any one type can be rearranged freely among themselves. For
consider the modifications of type (r, n − r − 1) with 2r < n (this inequality
imposes no restriction since in the contrary case one can look at the sequence
of modifications the other way round, starting from M_2). Repeated application
of Theorem 3 will rearrange these modifications in any preassigned way. The
question of identifying the modifications as they are permuted is settled by
noting that, since 2r < n, there is a set of disjoint r-spheres each to be shrunk
by one of the modifications, and the modifications can be named according
to the sphere shrunk.

Theorem 5 is a generalization of a well-known result concerning orientable
3-manifolds. Let M be an orientable 3-manifold with boundary formed by
M_1 and M_2, arranged as in Theorem 5; no generality is lost here since a
3-manifold can be triangulated and then smoothed to give a differentiable
manifold. The only modifications leading from M_1 to M_2 as in Theorem 5
will be of types (0, 1) and (1, 0), all those of the former type being done
first. If now M is a closed manifold, it can be assumed to be contained between
the hyperplanes x_N = 1 + ε and x_N = −ε, for small positive ε, while the
sections of M by the hyperplanes x_N = 1 and x_N = 0 will be 2-spheres M_2
and M_1, boundaries of 3-cells E_2 and E_1 which lie respectively in the sets
1 ≤ x_N ≤ 1 + ε and −ε ≤ x_N ≤ 0. Theorem 5 then implies that M is
obtained by applying (0, 1)-modifications to the surfaces of E_1 and E_2, filling
the surfaces in as one goes to obtain two solids, whose boundaries are then
identified. Since M is orientable, all the (0, 1)-modifications are of orientable
type (Theorem 2), and so the solids obtained are solid spheres with handles.
That is to say the manifold M is constructed by taking the union of two solid
handled spheres (necessarily of the same genus) and identifying their
boundaries (6, p. 219).

Clearly Theorem 5 gives a similar way of constructing a non-orientable
3-manifold. In this case, however, at least one of the modifications applied
to the surfaces M_1 and M_2 must be of non-orientable type. Thus the two
solids which are to be put together to form M must each have at least one
handle twisted (in the manner of the Klein surface).

To formulate Theorem 5 as a generalization of this classical result on
3-manifolds, a generalized handled sphere can be defined as an
(n + 1)-dimensional solid obtained from a solid (n + 1)-sphere by applying to
its surface (r, n − r − 1)-modifications, with r < n − r − 1, filling out the
surface at each stage to form an (n + 1)-solid. Then Theorem 5 implies that
any differentiable (n + 1)-manifold can be expressed as the union of two
generalized handled spheres with boundaries identified. In particular if M
is orientable, all the (0, n − 1)-modifications involved will be of orientable
type.


5. Complementary modifications. Let M_1 be a differentiable n-manifold
and let S^r be a directly embedded r-sphere to be shrunk by a spherical
modification φ. Suppose also that S^r is the boundary of an (r + 1)-cell
non-singularly and differentiably embedded in M_1. When B_1 = S^r × E^{n−r} is
removed from M_1, the remaining set will contain an (r + 1)-cell E_1^{r+1} with
boundary S^r × {p} for some p ∈ S^{n−r−1} = Fr E^{n−r}. When
B_2 = E_2^{r+1} × S^{n−r−1} is added to make the modification φ, E_2^{r+1} joins
up with E_1^{r+1} to form a sphere S^{r+1} in M_2. S^{r+1} is not necessarily
directly embedded in M_2, but a sufficient condition for direct embedding is
that the natural (the precise meaning of this overworked word in this context
is explained below) product structure of the normal bundle of E^{r+1} in M_1
should induce the product structure on B_1 associated with the modification φ.
If this condition is satisfied, a second modification φ' can be performed,
shrinking S^{r+1} and transforming M_2 into a manifold M_3.


LEMMA 5.1. Under the conditions just described M_1 and M_3 are homeomorphic.


Proof. A normal neighbourhood (union of normal geodesic elements) of
E^{r+1} in M_1 is an n-cell E_1^n, and it can be assumed that, in the
modification φ carrying M_1 into M_2, the complement of E_1^n in M_1 is left
unchanged. In the proof of this lemma, therefore, nothing is lost if
M_1 − E_1^n is replaced by a second n-cell E_2^n so that E_1^n ∪ E_2^n is an
n-sphere. The modifications are to be carried out on this sphere in such a way
that E_2^n is left unchanged, that is to say, so that a neighbourhood of some
point is left unchanged.

At this stage the phrase used above, "natural product structure" in a
neighbourhood of E^{r+1}, can be explained. The idea is that, when M_1 − E_1^n
is replaced by E_2^n to form the sphere S^n = E_1^n ∪ E_2^n, and then when the
neighbourhood B_1 of S^r is removed, the remainder of S^n will be a product
E_1^{r+1} × S^{n−r−1} having the cell E_1^{r+1} as one of its cross-sections.
The normal neighbourhood of E^{r+1} will then be the product E^{r+1} × U, where
U is a cellular neighbourhood on S^{n−r−1}, with B_1 added on.

The proof of the lemma will now be completed by performing the
modifications φ and φ', related as described above, on the n-sphere S^n, and
showing that the final result is again S^n.

S^n can be written as B_1 ∪ (E_1^{r+1} × S^{n−r−1}) =
(S^r × E^{n−r}) ∪ (E_1^{r+1} × S^{n−r−1}), where the boundaries of the two
products are identified. A point (p, q) is to be selected in the interior of
(E_1^{r+1} × S^{n−r−1}), and it is to be checked that at each stage a
neighbourhood of (p, q) is left invariant. The modification φ replaces B_1 by
a product E_2^{r+1} × S^{n−r−1}. Thus, with the boundaries of the products
identified, M_2 = (E_1^{r+1} × S^{n−r−1}) ∪ (E_2^{r+1} × S^{n−r−1}). It is clear
that a neighbourhood of (p, q) has been left invariant here. Also the
identification of the boundaries of the products is such that M_2 is
homeomorphic to S^{r+1} × S^{n−r−1}. Now S^{n−r−1} can be written as a union
E_1^{n−r−1} ∪ E_2^{n−r−1} of two cells, with q in the interior of E_2^{n−r−1}.
Thus, the boundaries of the products being identified,
M_2 = (S^{r+1} × E_1^{n−r−1}) ∪ (S^{r+1} × E_2^{n−r−1}), and the first product
is the normal bundle of S^{r+1}. Thus φ' consists in replacing this product by
(E^{r+2} × S^{n−r−2}), and the result is S^n; also in the process a
neighbourhood of (p, q) is left invariant, and so the proof is completed.
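
The two decompositions of S^n used in this proof are instances of one standard identity for the boundary of a product of cells. The display below is included only as a reminder of that identity; the exponents a and b are symbols introduced here, not notation from the original.

```latex
% Boundary of a product of an a-cell and a b-cell.
\[
\partial\bigl(E^{a} \times E^{b}\bigr)
  = \bigl(S^{a-1} \times E^{b}\bigr) \cup \bigl(E^{a} \times S^{b-1}\bigr),
\qquad
\partial\bigl(E^{a} \times E^{b}\bigr) \cong S^{a+b-1}.
\]
% a = r+1, b = n-r: the decomposition of S^n at the start of the proof.
\[
S^n \cong \bigl(S^{r} \times E^{n-r}\bigr) \cup \bigl(E^{r+1} \times S^{n-r-1}\bigr).
\]
% a = r+2, b = n-r-1: the decomposition reached after the second modification.
\[
S^n \cong \bigl(S^{r+1} \times E^{n-r-1}\bigr) \cup \bigl(E^{r+2} \times S^{n-r-2}\bigr).
\]
```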

If the situation described in the above lemma holds, the modification φ'
will be called complementary to φ.

One case in which this situation will always hold is where φ is a
(0, n − 1)-modification of orientable type. Thus if only the result of a
sequence of modifications is of interest (and not the manifold bounded by the
initial and final states) every orientable (0, n − 1)-modification can be
replaced by a (n − 2, 1)-modification.
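
The bookkeeping of types behind this replacement can be set out as follows. This is a sketch that uses two facts taken from the text: φ' shrinks the sphere S^{r+1} (here a 1-sphere, since r = 0), and the inverse of a (0, n − 1)-modification is an (n − 1, 0)-modification, which is read here as the general rule that inversion interchanges the two entries of the type.

```latex
% phi has type (0, n-1); its complementary modification phi' shrinks S^{r+1} = S^1.
\[
M_1 \xrightarrow{\ \phi\,:\ (0,\,n-1)\ } M_2
    \xrightarrow{\ \phi'\,:\ (1,\,n-2)\ } M_3 \cong M_1 .
\]
% Reading the second arrow backwards interchanges the two entries of the type.
\[
M_1 \cong M_3 \xrightarrow{\ (\phi')^{-1}\,:\ (n-2,\,1)\ } M_2 .
\]
```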

An important special case of this result is obtained by taking M_1 to be an
orientable 3-dimensional manifold. According to Thom's theory of cobounding
manifolds, M_1 is the boundary of an orientable 4-dimensional manifold. Hence,
by Theorem 2, M_1 can be obtained from a 3-sphere M_2 by a sequence of
(0, 2)-, (1, 1)-, and (2, 0)-modifications, those of types (0, 2) and (2, 0) all
being orientable. By the result just obtained, the modifications of types (0, 2)
and (2, 0) can all be replaced by modifications of type (1, 1). Translating
into simple geometrical language the meaning of a (1, 1)-modification, the
following theorem is proved, giving an affirmative answer to a problem of
Bing (2):


THEOREM 6. Any orientable 3-manifold can be obtained from a 3-sphere by 
removing a finite number of disjoint tori and refilling the resulting holes by tori 
with suitable identification of the boundary surfaces. 


6. Killing the fundamental group. The object of this section is to show
that a manifold which is orientable and of dimension n can always be carried
into a simply connected manifold by a finite sequence of spherical
modifications of type (1, n − 2). This having been done, the next section will
show how, under certain conditions, this process can be extended to one which
will kill all the homotopy, or what in this context is the same thing, the
homology groups up to the dimension n − 1.

The results of this section will be obtained by comparing the fundamental
groups of two orientable n-dimensional manifolds M_1 and M_2 which are
related by a single spherical modification φ of type (r, n − r − 1) (necessarily
orientable in case r = 0 or n − 1). As in § 2, M_1 and M_2 together will
constitute the boundary of an (n + 1)-dimensional manifold M which can be
assumed to lie on an (n + 1)-dimensional real algebraic variety in Euclidean
N-space. It is convenient here to arrange the co-ordinates in such a way that
M_1 and M_2 are, respectively, the sections of M by the hyperplanes x_N = −1
and x_N = 1, while x_N = 0 is the singular section of M corresponding to the
modification leading from M_1 to M_2. The singular point P of this section
can be taken as origin. As in § 2 there will be a family F of curves cutting
transversally across the sections of M by the hyperplanes x_N = c, except
at P. The members of the family F passing through P form two cells E^{r+1}
and E^{n−r}, the former lying in the set x_N ≤ 0 and having as its boundary the
sphere S^r in M_1 shrunk by the modification φ, while the latter lies in
x_N ≥ 0 and has as its boundary the sphere S^{n−r−1} in M_2 introduced by φ.

The most convenient way of comparing the fundamental groups of M_1 and
M_2 is to compare them both with that of M_0, the section of M by x_N = 0. This
will be done by means of the two mappings f_i : M_i → M_0 (i = 1, 2) defined
by setting f_i(p) equal to the point on M_0 and on the curve of F through p.
These are continuous mappings (10, ch. II), and so induce homomorphisms
f_{i*} : π_1(M_i) → π_1(M_0) (i = 1, 2). Here π_1 denotes the fundamental group,
and in the meantime M_1 and M_2 will be assumed to be connected. The following
lemma will now be proved.


LEMMA 6.1. (1) For 1 < r ≤ n − 1, f_{1*} is an isomorphism onto.

(2) For r = 1, f_{1*} is onto and its kernel is generated by the image of
π_1(S^r) in π_1(M_1) induced by the inclusion mapping.

(3) For r = 0, f_{1*} is an isomorphism into.


Proof. Let α be a closed path on M_1 beginning and ending at a base point
p on S^r, and suppose that f_1(α) is homotopic to a constant on M_0 with respect
to the fixed base point P = f_1(p). It is clear then that α is homotopic to a
constant on M with respect to the fixed base point p. That is to say, there
is a continuous mapping h of a 2-cell E^2 into M such that the restriction
of h to the circumference S^1 of E^2 coincides with α (S^1 is being identified
with a line segment with ends joined; this is really a description of free
homotopy, but for the present purpose no distinction need be made). Now h can
be assumed to be an algebraic mapping. This is done by noting that, under the
given h, co-ordinates in the ambient N-space are given as continuous functions
of the co-ordinates in a 2-space containing E^2. Approximating these functions
by polynomials and then projecting normally into M the required result is
obtained (4). At the same time h can be adjusted so that h(E^2), now a piece
of algebraic surface in M, bears a simple relation to E^{r+1} and E^{n−r}, which
can themselves be assumed to be pieces of algebraic subvarieties of M. Namely,
it can be assumed that, if 0 < r < n − 1, h(E^2) meets E^{r+1} ∪ E^{n−r} in at
most finitely many points, while if r = 0 or n − 1 the intersection may also
include some arcs of algebraic curves. These cases will now be considered in
more detail.

First take the case where 0 < r < n − 1. When the adjustments described
above have been made it can be assumed that there is at most a finite set
P_1, P_2, ..., P_m of points in the interior of E^2 such that h(P_i) is on
E^{r+1} or E^{n−r}. If the adjustment to h is sufficiently small the new path α
will of course be homotopic to the original one. It can also clearly be
arranged that exactly one point q of the boundary S^1 of E^2 is mapped on p by
h. Now let U be a small preassigned neighbourhood of P in M, and let W be the
point-set union of all the curves of the family F which meet U. It is not hard
to see that W is a neighbourhood of E^{r+1} ∪ E^{n−r} in M. Since h is
continuous it follows that there are neighbourhoods U_1, U_2, ..., U_m of
P_1, P_2, ..., P_m in E^2, which can in fact be assumed to be non-overlapping
circular discs, such that for each i, h(U_i) ⊂ W. From q draw an arc β_i to
some point on the circumference of U_i, for each i, arranging that the β_i do
not meet each other except at q. Let β be the closed path on E^2 starting at q
and going along β_i, round the circumference of U_i and back along β_i for each
i in turn. This can be done so that β is homotopic on E^2 − ∪ U_i to the path
which makes a single circuit of S^1. It then follows that α' = h(β) is a path
on M homotopic in M − E^{r+1} − E^{n−r} to α, with respect to the fixed base
point p. In fact the deformation of α into α' is carried out in M − W, with
possibly a small neighbourhood of p added on. But, making use of the family F
of curves, it can be seen that M_1 − (M_1 ∩ W), along with a small
neighbourhood of p, is a deformation retract of this set (cf. (10), p. 17), and
from this it follows that α is homotopic in M_1, with respect to the base point
p, to the path g(α'), where g maps a point t of M − W on the end point, on M_1,
of the curve of F through t. g(α') is a product of paths of the type
γ_i α_i γ_i^{-1} where γ_i = gh(β_i), and the α_i are closed paths in a small
neighbourhood of S^r, a neighbourhood which can be assumed to be a product of
S^r by a cell. Since r > 0, an easy transformation makes the γ_i into closed
paths based on p.











Then if r > 1, the α_i are all homotopic to a constant on M_1 (in fact in a
neighbourhood of S^r), and so in this case it has been shown that the kernel
of f_{1*} is the identity. On the other hand, if r = 1, the α_i represent
elements of the injection image of π_1(S^1), as required in the statement of
the lemma.

The kernel of f_{1*} must now be shown to be the identity in the cases r = 0
and r = n − 1. When r = 0, h(E^2) can be assumed to meet E^{r+1}, a 1-cell,
only at the point p, h(S^1) will not meet E^{n−r}, but h(E^2) may meet E^{n−r}
in some curves. In this case, in addition to the points P_i appearing in the
above discussion, there may be some algebraic curves in the interior of E^2
carried by h into E^{n−r}. Since h is continuous, it is in this case possible
to find a finite number of simple closed loops C_i in E^2, each surrounding one
or more of these curves, and each lying within such a small neighbourhood of
these curves that h(C_i) ⊂ W for each i. The P_i not already surrounded by the
C_i are to be given neighbourhoods U_i as before, and the U_i and C_i are not
to meet each other. The argument as above is then repeated, using the C_i
along with the circumferences of the U_i.

The case r = n − 1 is a little more complicated. h(E^2) will meet E^{n−r} at
most in a finite number of points (and this need only happen if n = 2), but
it may meet E^{r+1} in both isolated points and in pieces of algebraic curves,
some of which may be arcs with end points on α. The inverse images of these
arcs will be arcs of algebraic curves with end points on S^1. A preliminary
adjustment will be made this time, deforming the mapping h in such a way
that all these end points coincide with q. There are now in E^2 isolated points,
isolated curves in the interior of E^2, and a set of curves forming a connected
set containing q, all mapped into E^{r+1} ∪ E^{n−r} by h. The isolated points
and curves in the interior of E^2 are to be treated as in the case r = 0, and
the remaining curve is to be surrounded by a simple closed loop beginning and
ending at q and lying in such a small neighbourhood of the curve that it is
mapped by h into W. This loop is to be included in the product of paths
forming β, and the rest of the argument is the same as before.

To complete the proof of the lemma it must be shown that f_{1*} is onto
except in the case r = 0; it obviously will fail to be onto in this case. If then
r ≠ 0, let α be a closed path on M_0, and it is convenient this time to take
as base point for closed paths a point Q different from P. α is then homotopic
in M to a path α_1 not meeting E^{n−r}; this is possible since r ≠ 0. Let α_2
and α_3 be the projections of α_1 on M_0 and M_1 respectively along the curves
of F. The point f_1^{-1}(Q) is well defined and will be taken as base point for
closed paths on M_1. Clearly α_2 = f_1(α_3). On the other hand, α_2 is homotopic
in M, with respect to the base point Q, to α_1 and hence to α. But, using the
curves of F, M_0 is a deformation retract of M (10, ch. I, § 4) and so α_2 and
α are homotopic in M_0. Hence f_{1*} carries the homotopy class of α_3 in M_1
into that of α in M_0, and this shows f_{1*} to be onto for r ≠ 0. This
completes the proof of the lemma.

The case in which M_1 (or similarly M_2) is not connected is dealt with as
follows:








LEMMA 6.2. Continuing with the notation of the last lemma, let φ be a
(0, n − 1)-modification, let M_2 be connected but let M_1 consist of two
connected components M_1' and M_1''. Then the fundamental group of M_0 is the
free product of the images under f_{1*} of those of M_1' and M_1''.


Proof. This is a well known result, but is also easy to derive in the manner 
of the last lemma. 

Applying the above lemmas also to f_{2*} and putting the results together,
the following theorem is at once obtained.


THEOREM 7. Let M_1 and M_2 be two n-dimensional orientable differentiable
manifolds related by a spherical modification φ of type (r, n − r − 1). Then

(1) If 1 < r < n − 2, π_1(M_1) and π_1(M_2) are isomorphic under
f_{2*}^{-1} f_{1*}. (This can only happen if n > 4.)

(2) If r = 1 and n > 3, f_{2*}^{-1} f_{1*} is a homomorphism of π_1(M_1) onto
π_1(M_2) with kernel generated by the image of π_1(S^1) induced by the inclusion
of S^1 in M_1, S^1 being the 1-sphere shrunk by φ.

(3) If r = 0 and n > 2, and M_1 is connected, f_{2*}^{-1} f_{1*} is an
isomorphism into. If M_1 has two components, π_1(M_2) is the free product of
their fundamental groups.


Complementary results to (2) and (3) can of course be obtained by taking
r = n − 1 or n − 2. The condition n > 2 in (3) is no great obstacle, as
modifications on a surface are rather a trivial matter. On the other hand
the restriction n > 3 in (2) shows up one of the essential difficulties of the
3-dimensional case, where a modification which shrinks one circle simply
has the effect of introducing another.
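
A simple instance of part (2) of the theorem may make the statement concrete. The example is not in the original; it assumes n > 3, takes M_1 to be a product of spheres, and reads "kernel generated by" as the normal subgroup generated (which makes no difference here, the group being abelian).

```latex
% M_1 is a product of spheres; the circle S^1 x {x} carries the generator of pi_1.
\[
M_1 = S^1 \times S^{n-1}, \qquad \pi_1(M_1) \cong \mathbf{Z}, \qquad n > 3 .
\]
% Shrinking S^1 x {x} by a (1, n-2)-modification kills that generator (Theorem 7 (2)).
\[
\pi_1(M_2) \cong \pi_1(M_1) \big/ \langle\langle\, [S^1 \times \{x\}] \,\rangle\rangle
           \cong \mathbf{Z}/\mathbf{Z} = 1 .
\]
```

So a single (1, n − 2)-modification already makes this particular M_1 simply connected.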

Suppose now that M_1 is a compact orientable differentiable manifold of
dimension n > 3. π_1(M_1) is a finitely generated group in this case, and the
generators can be assumed to be carried by a finite collection of disjoint
1-spheres differentiably and, of course, directly embedded in M_1. Performing
the modifications which shrink these 1-spheres, and using part (2) of the
above theorem, we have the following theorem.


THEOREM 8. An orientable compact differentiable manifold of dimension n > 3
can be made simply connected by a finite sequence of (1, n − 2)-modifications.
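
The argument leading to this theorem can be summarized as a chain of quotients. In the sketch below the generators g_1, ..., g_k, the groups G_j, and the bar notation are names introduced here for illustration; "generated" in Theorem 7 (2) is again read as "normally generated".

```latex
% g_1, ..., g_k: generators of pi_1(M_1), carried by disjoint embedded 1-spheres.
% \bar g_j denotes the image of g_j in the group G_{j-1} obtained so far.
\[
G_0 = \pi_1(M_1) = \langle g_1, \dots, g_k \mid \dots \rangle,
\qquad
G_j = G_{j-1} \big/ \langle\langle\, \bar g_j \,\rangle\rangle
\quad (j = 1, \dots, k).
\]
% After all k modifications every generator has been killed.
\[
G_k \cong \pi_1(M_1) \big/ \langle\langle\, g_1, \dots, g_k \,\rangle\rangle = 1 .
\]
```

Each step is an application of Theorem 7 (2) to the manifold produced by the previous step; the 1-spheres being disjoint, those not yet shrunk survive unchanged in it.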


Note that, according to Theorem 6 and the remarks preceding it, the
condition n > 3 can be dropped. But there is no guarantee in the case of
n = 3 that the modifications involved correspond to a systematic killing of
the generators of the fundamental group.


7. Killing the homology groups. The aim of this section is to give a
partial extension of the results of the last section to the homology and
homotopy groups of dimension higher than the first. The idea is that a cycle
carried by a directly embedded sphere can be annulled by the modification
which shrinks that sphere. But the condition imposed here on the cycle is a
rather strong one, and so no sort of complete theory is possible until the
situation has been analysed in much greater detail. The ideal result would
be to achieve a complete "killing" by adding to the given manifold suitable
auxiliary manifolds, namely representatives of the generators of the Thom
cobounding groups but, in the meantime, a few of the simpler cases will be
treated.

Let φ be a spherical modification of type (r, n − r − 1) carrying the
differentiable manifold M_1 into M_2, shrinking the sphere S^r ⊂ M_1 and
introducing S^{n−r−1} ⊂ M_2. Let B_1 and B_2 be the normal bundles of S^r and
S^{n−r−1} in M_1 and M_2, both, of course, topological products. Using singular
homology with integral coefficients, an application of the homotopy and
excision theorems shows that H_p(M_1, S^r) ≅ H_p(M_1 − B_1, Fr B_1), and
H_p(M_2, S^{n−r−1}) ≅ H_p(M_2 − B_2, Fr B_2), for all p. On the other hand φ
induces a homeomorphism between M_1 − B_1 and M_2 − B_2, and so it follows that
H_p(M_1, S^r) ≅ H_p(M_2, S^{n−r−1}) for all p. The results to be obtained now
depend on the examination of the following diagram, in which the horizontal
lines are the appropriate homology sequences:


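\[
\begin{array}{ccccccccc}
\to & H_p(S^r) & \xrightarrow{\ i_p\ } & H_p(M_1) & \xrightarrow{\ j_p\ } & H_p(M_1, S^r) & \xrightarrow{\ \partial_p\ } & H_{p-1}(S^r) & \to \\
 & & & & & \| & & & \\
\to & H_p(S^{n-r-1}) & \xrightarrow{\ i'_p\ } & H_p(M_2) & \xrightarrow{\ j'_p\ } & H_p(M_2, S^{n-r-1}) & \xrightarrow{\ \partial'_p\ } & H_{p-1}(S^{n-r-1}) & \to
\end{array}
\tag{7}
\]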


The proofs of the following lemmas are immediate. 


LEMMA 7.1. In the above notation if 2r < n − 1 (that is r < n − r − 1) then
H_p(M_1) ≅ H_p(M_2) for p < r and H_r(M_2) ≅ H_r(M_1)/i_r H_r(S^r).


Obviously there is a complementary result for $r > n - r - 1$, amounting
simply to looking at $\phi$ as leading from $M_2$ to $M_1$. If in the lemma just
proved $S^r$ carries a representative of some generator of $H_r(M_1)$, then the
lemma shows that the effect of $\phi$ is to annul that generator.
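
Since the proof of Lemma 7.1 is left as immediate, the following sketch spells out one way of reading it off from diagram (7). The names $j_p$ and $j'_p$ for the maps into the relative groups, the restriction to $p \geq 1$, and the use of reduced homology in dimension zero are assumptions made here for convenience and are not taken from the text.

```latex
% Sketch of Lemma 7.1, assuming 2r < n-1, p >= 1, and reduced homology in
% dimension 0, so that H_q(S^k) = 0 unless q = k.
% j_p : H_p(M_1) -> H_p(M_1,S^r) and j'_p : H_p(M_2) -> H_p(M_2,S^{n-r-1})
% are assumed names for the maps of the two homology sequences.
\begin{align*}
&\text{Case } p < r:\quad H_p(S^r) = H_{p-1}(S^r) = 0
  \;\Longrightarrow\; j_p \text{ is an isomorphism};\\
&\qquad p < r < n-r-1 \;\Longrightarrow\;
  H_p(S^{n-r-1}) = H_{p-1}(S^{n-r-1}) = 0
  \;\Longrightarrow\; j'_p \text{ is an isomorphism};\\
&\qquad H_p(M_1) \cong H_p(M_1,S^r) \cong H_p(M_2,S^{n-r-1}) \cong H_p(M_2).\\[4pt]
&\text{Case } p = r:\quad H_{r-1}(S^r) = 0
  \;\Longrightarrow\; H_r(M_1,S^r) \cong H_r(M_1)/i_rH_r(S^r);\\
&\qquad H_r(S^{n-r-1}) = H_{r-1}(S^{n-r-1}) = 0
  \;\Longrightarrow\; j'_r \text{ is an isomorphism};\\
&\qquad H_r(M_2) \cong H_r(M_2,S^{n-r-1}) \cong H_r(M_1,S^r)
  \cong H_r(M_1)/i_rH_r(S^r).
\end{align*}
```

As an illustration (an example chosen here, not one from the text), take $M_1 = S^1 \times S^3$, so that $n = 4$ and $r = 1$, with the sphere $S^1 \times \{\mathrm{pt}\}$. This sphere carries a generator of $H_1(M_1) \cong \mathbb{Z}$, and the last line of the sketch gives $H_1(M_2) = 0$.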


LEMMA 7.2. If $r + 1 < p < n - r - 1$, then $H_p(M_1) \cong H_p(M_2)$ and, except
when $n$ is even and equal to $2(r + 1)$, $H_{r+1}(M_1)$ can be identified with a
subgroup of $H_{r+1}(M_2)$ and the quotient group is isomorphic to the kernel of $i_r$.


In particular, this shows that, if the cycle carried by $S^r$ is homologous to
zero in $M_1$ or is a torsion cycle, the effect of the modification (with the
exception noted) is to add another generator to the $(r + 1)$st homology group.
These two lemmas show two of the characteristic ways in which a modification
can affect homology. Note that, if $M_1$ and $M_2$ are simply connected and
$r > 1$ (in addition to the conditions already imposed on it) then the above
results, by the Hurewicz isomorphism theorem, can be interpreted in terms
of the homotopy groups provided that all the lower dimensional homology
groups are already known to be zero.
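
Lemma 7.2 can be read off from diagram (7) in much the same way. The sketch below assumes $2r < n - 1$ and $r \geq 1$, and uses $j_{r+1}$ and $\partial_{r+1}$ as assumed names for the maps of the homology sequence of the pair $(M_1, S^r)$.

```latex
% Sketch of Lemma 7.2, assuming 2r < n-1 and r >= 1.
\begin{align*}
&\text{Case } r+1 < p < n-r-1:\quad
  H_p(S^r) = H_{p-1}(S^r) = 0,\quad
  H_p(S^{n-r-1}) = H_{p-1}(S^{n-r-1}) = 0,\\
&\qquad\text{so } H_p(M_1) \cong H_p(M_1,S^r) \cong H_p(M_2,S^{n-r-1}) \cong H_p(M_2).\\[4pt]
&\text{Case } p = r+1,\ n \neq 2(r+1):\quad
  H_{r+1}(S^r) = 0 \text{ gives the exact sequence}\\
&\qquad 0 \to H_{r+1}(M_1) \xrightarrow{\;j_{r+1}\;} H_{r+1}(M_1,S^r)
  \xrightarrow{\;\partial_{r+1}\;} \ker i_r \to 0;\\
&\qquad H_{r+1}(S^{n-r-1}) = H_r(S^{n-r-1}) = 0
  \;\Longrightarrow\; H_{r+1}(M_2) \cong H_{r+1}(M_2,S^{n-r-1}) \cong H_{r+1}(M_1,S^r),\\
&\qquad\text{hence } H_{r+1}(M_2)/H_{r+1}(M_1) \cong \ker i_r .
\end{align*}
```

The excluded case $n = 2(r+1)$ is exactly the one in which $H_{r+1}(S^{n-r-1})$ does not vanish. As an illustration chosen here, take $M_1 = S^5$ and $r = 1$: since $H_1(S^5) = 0$, the kernel of $i_1$ is all of $H_1(S^1) \cong \mathbb{Z}$, and since $H_2(S^5) = 0$ the sketch gives $H_2(M_2) \cong \mathbb{Z}$.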













The cases in which the condition $2r < n - 1$ fails will now be examined.
This will have to be done separately in the two cases $n$ odd and $n$ even. First
consider an odd value $2m + 1$ of the dimension; the case to be looked at
then corresponds to the value $m$ of $r$.


LEMMA 7.3. In the situation just described, $H_p(M_2) \cong H_p(M_1)$ for all $p < m$.
If the image of $i_m$ is not of finite order in $H_m(M_1)$ then the effect of $\phi$
on $M_1$ is to reduce the $m$th Betti number by 1, but possibly to introduce a new
torsion cycle.


Proof. The first statement, concerning $p < m$, follows at once from the
diagram (7). Next, if the image of $i_m$ is not of finite order in $H_m(M_1)$, the
fundamental cycle $\alpha$ of $S^m$ will be homologous in $M_1$ to $k\alpha_1$, where $k$ is an
integer and $\alpha_1$ belongs to a Betti basis for $M_1$. Using a dual basis, it follows
that there is a cycle $\beta$ on $M_1$ such that $\beta \cdot \alpha = k$. Now $\beta$ can be chosen as a
linear combination of singular simplexes on $M_1$ each of which either does not
meet $S^m$ or has exactly one interior point in common with $S^m$. If the latter
simplexes are removed a relative cycle of $M_1 - B_1$ modulo $\operatorname{Fr} B_1$ is obtained
whose boundary is easily seen to be $k\gamma$, where $\gamma$ is a fundamental cycle of
the $m$-sphere $S_1^m$ in $M_2$, introduced by the modification. Clearly $k\gamma$ is
homologous to zero in $M_2$. Now the diagram (7) gives the isomorphism

$$
H_m(M_1)/i_m H_m(S^m) \cong H_m(M_2)/i'_m H_m(S_1^m).
$$

Since the images of $i_m$ and $i'_m$ have generators represented respectively by
$\alpha$ and $\gamma$, the result stated follows at once.

There is obviously a complementary result to the above, starting with
the assumption that $\alpha$ is a torsion cycle in $M_1$; this is not an essentially
different result, but simply consists in reversing the parts played by $M_1$ and
$M_2$ in the above.
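
The isomorphism quoted in the proof of Lemma 7.3 can be traced through diagram (7) step by step. The sketch below assumes $n = 2m + 1$, $r = m \geq 1$ and reduced homology in dimension zero, and writes $S_1^m$ for the $m$-sphere in $M_2$ introduced by the modification.

```latex
% n = 2m+1, r = m, m >= 1; reduced homology in dimension 0 is assumed,
% so that H_{m-1} of an m-sphere vanishes also when m = 1.
\begin{align*}
H_m(M_1)/i_mH_m(S^m)
  &\cong H_m(M_1,S^m) && (H_{m-1}(S^m) = 0)\\
  &\cong H_m(M_2,S_1^m) && (\text{middle isomorphism of (7)})\\
  &\cong H_m(M_2)/i'_mH_m(S_1^m) && (H_{m-1}(S_1^m) = 0).
\end{align*}
```

As an illustration chosen here, take $M_1 = S^1 \times S^2$ with $m = 1$ and the circle $S^1 \times \{\mathrm{pt}\}$. Then $\alpha$ generates $H_1(M_1) \cong \mathbb{Z}$, so $k = 1$ and the left-hand quotient is zero; since $k\gamma = \gamma$ is homologous to zero in $M_2$, it follows that $H_1(M_2) = 0$. The first Betti number drops from 1 to 0 and no torsion appears; in this case $M_2$ is in fact a 3-sphere.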

Consider next an even value $2m$ for $n$. The inequality $2r < n - 1$ is
equivalent to $r < m$, and so again the case requiring special attention is $r = m$.


LEMMA 7.4. If the image of $i_m$ is not of finite order the effect of the modification
is to decrease the $m$th Betti number by 2.


Proof. If $\alpha$ is the fundamental cycle of $S^m$ there is a cycle $\beta$ on $M_1$ such
that $\beta \cdot \alpha \neq 0$. Then, reasoning as in the last lemma, it follows that the
modification annuls the homology classes, over rational numbers, of $\alpha$ and $\beta$;
these classes are certainly different, for, since $S^m$ is directly embedded, $\alpha \cdot \alpha = 0$.

Lemma 7.4 could be formulated more completely by describing the effect 
of the modification on torsion, but as there is no immediate application it 
does not seem worth while. In any case the most suitable situation for applying 
this result would be where the lower dimensional homology groups were all 
zero, when the m-dimensional torsion group would also automatically vanish. 
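
The simplest instance of Lemma 7.4 is the torus, with $n = 2$ and $m = 1$. This example is not in the text and is offered only as an illustration; $b_1$ denotes the first Betti number, and the modification is of type $(1, 0)$, shrinking a non-separating circle and introducing a 0-sphere.

```latex
% Illustration of Lemma 7.4 with n = 2, m = 1 (an assumed example, not from the text).
\begin{align*}
&M_1 = S^1 \times S^1,\qquad \alpha = [S^1 \times \{\mathrm{pt}\}],\qquad
  \beta = [\{\mathrm{pt}\} \times S^1],\\
&\beta \cdot \alpha = \pm 1,\qquad \alpha \cdot \alpha = 0,\qquad
  H_1(M_1) \cong \mathbb{Z}\alpha \oplus \mathbb{Z}\beta,\\
&M_2 = \bigl(M_1 - S^1 \times \operatorname{int} D^1\bigr) \cup \bigl(D^2 \times S^0\bigr) \cong S^2,
  \qquad H_1(M_2) = 0,\\
&b_1(M_1) - b_1(M_2) = 2 - 0 = 2 .
\end{align*}
```

Cutting the torus along the circle leaves an annulus, and capping its two boundary circles with discs gives a 2-sphere, so both $\alpha$ and $\beta$ are annulled, as the lemma predicts.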













REFERENCES 


A. Aeppli, Modifikationen von reellen und komplexen Mannigfaltigkeiten, Comm. Math. Helv., 31 (1957), 219-301.

R. H. Bing, Necessary and sufficient conditions that a 3-manifold be $S^3$, Ann. Math., 68 (1958), 17-37.

F. Hirzebruch, Neue topologische Methoden in der algebraischen Geometrie, Ergebnisse der Mathematik (Berlin, 1956).

J. Nash, Real algebraic manifolds, Ann. Math., 56 (1952), 405-421.

P. Samuel, Sur l’algébricité de certains points singuliers, J. math. pures et appl., 35 (1956), 1-6.

H. Seifert and W. Threlfall, Lehrbuch der Topologie (New York, 1934).

H. Seifert and W. Threlfall, Variationsrechnung im Grossen (New York, 1951).

R. Thom, Quelques propriétés globales des variétés différentiables, Comm. Math. Helv., 28 (1954), 17-86.

A. H. Wallace, Algebraic approximation of manifolds, Proc. Lond. Math. Soc., 7 (1957), 196-210.

A. H. Wallace, Homology theory on algebraic varieties (London, 1957).


Indiana University 














